Location modules: Early findings

Marcus Lewis, 2017-06-23

In March I taught a sensor how to infer locations and objects using this 3-layer approach:

In this experiment, I use the same fundamental approach, but I implement the "location layer" differently. The layer uses principles of grid cells. It breaks the "location SDR" into a bunch of small modules. Each of these modules does path integration independently, and the population of cells represents a unique location.

A location module is a group of cells that each fire at a different sensor location. You can visualize a location module by sorting the cells by their firing fields. Here are 25 cells mapped onto the locations where they fire. The activity of the 5 bold cells is shown as the sensor moves:


This chart assigns each cell an "excitation". This excitation might be a firing rate, a membrane potential, or some other continuous variable.

If the sensor moves past the edge of a location module's square, the represented location wraps around to the other end. So the individual cells have grid-like firing fields.

I built a "location layer" out of 18 location modules. Similar to grid cell modules, each module has a different scale and orientation -- i.e. the squares are sized and rotated differently. Together, the modules create novel location SDRs without having to learn any new path integration.

I didn't model a specific neural path integration mechanism. Instead, I simulate the result of path integration. I don't specify how cells do the computation: "Cell 3 is active" + "The sensor moves to the right" => "Cell 4 is active". I just assume that it works, and I look at what we can do with it.
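
Since path integration is only simulated, a location module reduces to a little bookkeeping. Below is a minimal sketch of that bookkeeping -- my own toy code, not the code behind these experiments, with illustrative names like LocationModule and move: each module tracks a 2D phase, transforms movements by its own scale and orientation, wraps around at the edges, and reports the cell whose firing field contains the current phase.

import numpy as np

class LocationModule(object):
    """A toy location module: cellsPerAxis x cellsPerAxis cells tile a square
    of side `scale`, rotated by `orientation`. This only simulates the result
    of path integration; it is not a neural mechanism."""

    def __init__(self, cellsPerAxis, scale, orientation):
        self.cellsPerAxis = cellsPerAxis
        self.scale = scale
        c, s = np.cos(orientation), np.sin(orientation)
        self.rotation = np.array([[c, -s],
                                  [s,  c]])
        # The sensor's location within this module's square, in [0, 1) per axis.
        self.phase = np.zeros(2)

    def move(self, displacement):
        # Rotate and scale the movement into this module's coordinate frame,
        # then wrap around at the edges. The wraparound is what gives each
        # cell a grid of firing fields.
        self.phase = (self.phase + self.rotation.dot(displacement) / self.scale) % 1.0

    def activeCell(self):
        # The cell whose firing field contains the current phase.
        indices = np.floor(self.phase * self.cellsPerAxis).astype(int) % self.cellsPerAxis
        col, row = indices
        return row * self.cellsPerAxis + col

# A "location layer" is a set of modules with different scales and orientations.
# Together their active cells form the location SDR.
np.random.seed(42)
modules = [LocationModule(cellsPerAxis=5, scale=scale, orientation=orientation)
           for scale, orientation in zip(np.random.uniform(10.0, 40.0, 18),
                                         np.random.uniform(0.0, np.pi / 3, 18))]

for m in modules:
    m.move(np.array([3.0, -1.0]))  # simulate one sensor movement

locationSDR = [m.activeCell() for m in modules]  # one active cell per module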

Here are some early findings.

Finding 1: Precision of inferred locations should be honest.

Location cells can be driven by either:

  • Processed motor input
    • (Details not discussed here.)
  • Processed sensory input
    • From the "feature-location pair" layer

The processed sensory input drives the location cells whenever path integration doesn't do the job. This always happens on the first sensation of an object, and it can happen after path integration errors.

When the processed sensory input activates a location cell, the cell becomes active because it was active on previous occurrences of this sensory input. The layer essentially "restores" the cell activity of the previous occurrences. But there's a problem here: we can't restore the precise cell activity.


Unless synapses are weighted to restore the exact cell activity, this loss of precision is going to happen.

On subsequent motions, the location module will often activate a different cell than would have been activated had each cell's activity been restored precisely to its previous state.
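
Here is the off-by-one error in miniature -- a toy calculation with hypothetical helper names, using a single 25-cell module: sensory input can only tell us which cell was active, so suppose we restore that cell's center. If the true location was near the edge of the firing field, the next movement crosses a cell boundary for the true location but not for the restored one.

import numpy as np

cellsPerAxis = 5  # a 25-cell module

def activeCell(phase):
    col, row = np.floor(np.mod(phase, 1.0) * cellsPerAxis).astype(int)
    return row * cellsPerAxis + col

def cellCenter(cell):
    row, col = divmod(cell, cellsPerAxis)
    return (np.array([col, row]) + 0.5) / cellsPerAxis

truePhase = np.array([0.19, 0.42])                 # near the edge of a firing field
restoredPhase = cellCenter(activeCell(truePhase))  # all we can restore is "cell 10"

movement = np.array([0.05, 0.0])
print(activeCell(truePhase + movement))      # 11: the true location crossed a boundary
print(activeCell(restoredPhase + movement))  # 10: the restored location didn't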

In the recording below, the input layer often bursts after a few touches. This happens because these off-by-one errors in the active cells happen often enough to prevent cells in the input layer from being predicted.

In [1]:
import htmresearchviz0.IPython_support
from htmresearchviz0.IPython_support import printLocationModuleInference

# Set up the notebook to render the interactive visualizations.
htmresearchviz0.IPython_support.init_notebook_mode()

# Replay a recorded inference run as an interactive chart.
with open("logs/1-points-25-cells.log", "r") as fileIn:
    printLocationModuleInference(fileIn.read())

This can keep the network from ever inferring the object.

There are a few possible solutions to this problem:

  1. Don't try to solve the problem in the location layer. Instead, lower the prediction threshold in the input layer, increasing the false positive rate.
  2. Use precise synapse weights to restore the precise original activity.
  3. Use denser location SDRs. Activate more than one cell per module. This would make the problem less dramatic -- each module will still have an occasional missing cell, but most cells that should be active will be active.
  4. Activate the same number of cells, but activate an "imprecise" location (sketched below). Subsequent movements will often activate multiple cells for each active cell.
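
Here is what option 4 looks like in the same toy setting -- again my own sketch, not the experiment's implementation: treat the restored location as the active cell's entire firing field rather than a single point, and after a movement, activate every cell that any of those candidate locations lands in.

import numpy as np

cellsPerAxis = 5

def activeCell(phase):
    col, row = np.floor(np.mod(phase, 1.0) * cellsPerAxis).astype(int)
    return row * cellsPerAxis + col

# Sensory input says cell 10 was active, so the location is somewhere in that
# cell's firing field: x in [0.0, 0.2), y in [0.4, 0.6). Represent this
# "imprecise" location as a sample of candidate phases spanning the field.
candidates = [np.array([x, y])
              for x in np.linspace(0.01, 0.19, 10)
              for y in np.linspace(0.41, 0.59, 10)]

movement = np.array([0.05, 0.0])
print(sorted({activeCell(p + movement) for p in candidates}))
# [10, 11]: a union of two cells, one of which is correct whichever phase was true.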


In the recording below, the problem is solved. Since we activate an imprecise location, we rarely miss a cell activation. Cells that should be active are rarely left inactive, so the input layer always has predicted cells.

In [2]:
with open("logs/9-points-25-cells.log", "r") as fileIn:
    printLocationModuleInference(fileIn.read())

Generally, a "Move" causes more cells to activate, but the firing fields don't become more dense. Try hovering over a cell before the first "Move", and then click. More cells become active, but the firing fields are unchanged. (Although you'll see the fields shift if you're hovering over a cell that becomes inactive, because this causes the visualization to select a different cell.)

Finding 2: In this experiment, inference improves with more cells per module...

By definition, adding cells to a location module increases the precision of each location SDR. On its own, this allows the network to distinguish objects with differences at smaller scales. But there's a second effect: the margin of error for each inferred location is smaller, and hence inference is faster.
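
As back-of-the-envelope arithmetic (my framing, not a measurement from this experiment): in a module with N cells per axis, each firing field spans 1/N of the module's square per axis, and that width is the margin of error when sensory input restores a location.

# Margin of error when sensory input can only restore which cell was active:
# the width of one firing field, as a fraction of the module's square.
for cellCount in (25, 100):
    cellsPerAxis = int(round(cellCount ** 0.5))
    print("%d cells per module: fields span 1/%d of the square per axis"
          % (cellCount, cellsPerAxis))
# 25 cells per module: fields span 1/5 of the square per axis
# 100 cells per module: fields span 1/10 of the square per axis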

In [3]:
with open("logs/9-points-100-cells.log", "r") as fileIn:
    printLocationModuleInference(fileIn.read())

...but that's partly due to this experiment's setup.

Adding cells improved time-to-inference because the sensory input caused a sparser activation in the location layer. But it only had this effect because this experiment uses a few specific points on each object.

This experiment treats objects as a bunch of equally-sized squares. The sensor only touches one point on each of these squares: the center. During training and during testing, it never touches any other point. So, although the "36 cell" map above shows "B" in the firing fields of 4 different cells, sensing "B" only activates one cell: the one that contains the center of "B".

If the sensor had learned multiple points in each square, then sensing "B" would activate all 4 of these cells.

With more cells per module, the percentage of active cells will be lower at the end of inference, but it will not generally be lower at the beginning of inference. With a well-learned object, the percentage of cells in a module activated by a feature will be roughly constant, independent of cell count.
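
Here is a quick sanity check of that claim in the toy setting from earlier (the numbers are hypothetical; the real objects and firing fields differ): suppose feature "B" had been learned at locations filling a small square of the module's tile. Counting the cells those locations activate shows the fraction of active cells staying constant as the cell count grows.

import numpy as np

def activeCell(phase, cellsPerAxis):
    col, row = np.floor(np.mod(phase, 1.0) * cellsPerAxis).astype(int)
    return row * cellsPerAxis + col

# Hypothetical: feature "B" was sensed at locations filling a square that
# covers about a fifth of the module's tile per axis.
learnedPhases = [np.array([x, y])
                 for x in np.linspace(0.21, 0.39, 20)
                 for y in np.linspace(0.21, 0.39, 20)]

for cellsPerAxis in (5, 10):  # a 25-cell module vs. a 100-cell module
    active = {activeCell(p, cellsPerAxis) for p in learnedPhases}
    total = cellsPerAxis ** 2
    print("%d cells: %d activated by B (%.2f of the module)"
          % (total, len(active), float(len(active)) / total))
# 25 cells: 1 activated by B (0.04 of the module)
# 100 cells: 4 activated by B (0.04 of the module)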

Next steps

Characterize this model:

  • What's the object capacity? How does it vary with:
    • Number of unique features
    • Size of objects
    • Number of location modules
    • Cells per location module
    • The scales / orientations of the location modules
  • How well does it handle objects with different-sized features?
    • Do the small-scale modules cause problems by activating large unions?
    • Do the large-scale modules cause problems by activating the same cell for lots of locations?
  • Can I craft combinations of objects that will confuse this model? If so, does this demonstrate an issue that will be widespread?

Add more to this model:

  • Add more cortical columns. Rely on the "object" layer's long-range connections. How much does this improve inference? Does it just make it faster, or does it make it truly more capable?
  • Add proprioception. With multiple sensors, the relative location of the sensors is valuable information for inference.