Using the LstsqDrop() solver and the "weights" parameter

Hi all,

I want to sparsely connect two ensembles, as their “human functional counterparts” are also sparsely connected. Let's say that the connectivity between the two is 5%; i.e., each neuron in the pre ensemble connects to 5% of all neurons in the post ensemble.

I looked into how to do this, and the LstsqDrop() solver seems like a good candidate. However, since the Nengo solvers solve for decoders by default, I am not sure whether specifying drop=0.95 actually produces the sparse connectivity I outlined in the previous paragraph. As far as I can tell, the drop is applied to the connection from a neuron to (some of) the dimensions of the vector representation, i.e. to individual decoder values.
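For reference, this is roughly the kind of connection I have in mind (just a sketch; the ensemble sizes are placeholders):

```python
import nengo

with nengo.Network() as model:
    pre = nengo.Ensemble(n_neurons=100, dimensions=2)
    post = nengo.Ensemble(n_neurons=100, dimensions=2)

    # My assumption: dropping 95% of the decoders gives ~5% connectivity,
    # but I am not sure this is what actually happens.
    conn = nengo.Connection(
        pre, post, solver=nengo.solvers.LstsqDrop(drop=0.95)
    )
```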

Could anyone explain whether I would have to set the weights parameter to True to implement the sparse connectivity as I described it? I tried to work it out from the description of how decoded connections pass along information. I believe this is done as shown in Stewart (2012), Figure 2b, but I cannot wrap my head around how dropping some of the decoder weights (to zero) affects the final connectivity.

Perhaps also relevant: this connection will later be trained using the Voja() learning rule. Will the connection still preserve its sparse connectivity in that case? Does this depend on the weights parameter?

Thanks for any thoughts,

Chiel

Hi @ChielWijs,

Before I delve into the Nengo-related stuff, I'd like to give my own interpretation of sparse connectivity. When it is stated that the connectivity between two populations of neurons is some percentage, I usually take it to mean that the percentage is with respect to a fully connected network. That is to say, if every neuron in the “pre” population is connected to every neuron in the “post” population, the two populations would be 100% connected. If 50% of those connections are missing (or, in mathematical terms, set to 0), then the two populations would be 50% connected. Note the distinction between my statement and what you want to achieve. Namely:

In your statement, the overall connectivity percentage is the same as in my description; however, your description imposes an additional constraint: each neuron must have the same connectivity percentage (rather than the percentage being defined only w.r.t. the entire population).

The paper (and the NEF book) illustrates that the connection weight matrix between two neural ensembles can be computed from the encoders (\textbf{e}) and decoders (\textbf{d}) using the formula:

\omega_{ij} = \textbf{d}_i\textbf{e}_j

We can illustrate this using the following diagram:
\begin{matrix} \textbf{d}\downarrow \textbf{e}\rightarrow& \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \end{bmatrix} \\ \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} & \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \end{bmatrix}\\ \end{matrix}
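In NumPy terms, the full weight matrix above is just the outer product of the decoder column vector and the encoder row vector (a toy sketch with all-ones vectors, matching the diagram; rows correspond to the “pre” neurons’ decoders and columns to the “post” neurons’ encoders):

```python
import numpy as np

d = np.ones((5, 1))  # decoders of the "pre" population (column vector)
e = np.ones((1, 5))  # encoders of the "post" population (row vector)

W = d @ e            # full 5x5 connection weight matrix: W[i, j] = d_i * e_j
print(W)             # all entries are 1, i.e. every neuron connects to every neuron
```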

We can now extrapolate how sparse decoders will affect the full connection weight matrix. Let’s set all decoders except for the first row to 0:

\begin{matrix} \textbf{d}\downarrow \textbf{e}\rightarrow& \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \end{bmatrix} \\ \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} & \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}\\ \end{matrix}

What does this result tell us? It tells us that by setting a bunch of decoders to 0, the neurons with 0 decoders aren't connected to anything at all, while the one neuron with a non-zero decoder is fully connected to all of the neurons in the “post” population. This is… not what you want to achieve (according to your description).
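Repeating the toy computation from above with the sparse decoders makes this explicit:

```python
import numpy as np

e = np.ones((1, 5))          # encoders of the "post" population
d_sparse = np.zeros((5, 1))  # decoders with all but the first entry dropped to 0
d_sparse[0] = 1

W_sparse = d_sparse @ e
print(W_sparse)
# Row 0 is all ones: that neuron still projects to every "post" neuron.
# Rows 1-4 are all zeros: those neurons project to nothing.
```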

So… what does the weights=True setting on the solver do? It sets the lowest X% of the full connection weights to 0 (and then re-solves the remaining weights; see the sketch below). This will accomplish my interpretation of connection sparsity, but there is a caveat, which relates to the Voja() learning rule you mentioned.
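Just to make that concrete (the ensemble sizes here are placeholders, and drop=0.95 matches your 95% target):

```python
import nengo

with nengo.Network() as model:
    pre = nengo.Ensemble(n_neurons=100, dimensions=2)
    post = nengo.Ensemble(n_neurons=100, dimensions=2)

    # weights=True makes the solver operate on (and drop entries of) the full
    # connection weight matrix rather than on the decoders.
    conn = nengo.Connection(
        pre, post,
        solver=nengo.solvers.LstsqDrop(drop=0.95, weights=True),
    )
```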

The Voja rule operates specifically on the encoders of the “post” population, which in turn requires the connection to use encoders (and decoders) instead of the full weight matrix. This means that using weights=True when calling the solver will not work in this case. It might be possible to generate the weight matrix, sparsify it, and then factor it back out into encoders and decoders, but that is (in my opinion) mathematically hard to do (and may even be impossible). However, it should be possible to use the Oja rule (on which the Voja rule is based) with the full (sparsified) weight matrix.
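A very rough sketch of that Oja alternative is below. Note that I'm just using a random mask as a stand-in for however you end up sparsifying the weight matrix, and the weight scale and other parameters are placeholders:

```python
import numpy as np
import nengo

n_pre, n_post = 100, 100
rng = np.random.RandomState(0)

# Placeholder sparse weight matrix: ~5% of the entries are nonzero
weights = rng.uniform(-1e-4, 1e-4, size=(n_post, n_pre))
weights *= rng.rand(n_post, n_pre) < 0.05

with nengo.Network() as model:
    pre = nengo.Ensemble(n_neurons=n_pre, dimensions=2)
    post = nengo.Ensemble(n_neurons=n_post, dimensions=2)

    # Neuron-to-neuron connection with an explicit (sparsified) weight matrix,
    # trained with the Oja rule instead of Voja
    conn = nengo.Connection(
        pre.neurons, post.neurons,
        transform=weights,
        learning_rule_type=nengo.Oja(),
    )
```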

As to whether the learning rule will preserve the sparsity: technically, even 0 weights can be increased (or decreased) by the learning rule. This means that, in theory, the learning rule will not respect the sparsity of the weight matrix. However, in practice, it may be that the 0 weights lead to 0 spikes, which then result in no change to the weight matrix (which would preserve the sparsity). The only way to know for sure is to experiment with it.
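One way to run that experiment is to probe the connection weights during learning and check whether the entries that started at 0 stay at 0. A sketch, building on the neuron-to-neuron setup above (the input signal and run time are arbitrary):

```python
import numpy as np
import nengo

n_pre, n_post = 100, 100
rng = np.random.RandomState(0)

weights = rng.uniform(-1e-4, 1e-4, size=(n_post, n_pre))
mask = rng.rand(n_post, n_pre) < 0.05  # ~5% connectivity
weights *= mask

with nengo.Network() as model:
    stim = nengo.Node(lambda t: np.sin(2 * np.pi * t))
    pre = nengo.Ensemble(n_neurons=n_pre, dimensions=1)
    post = nengo.Ensemble(n_neurons=n_post, dimensions=1)
    nengo.Connection(stim, pre)

    conn = nengo.Connection(
        pre.neurons, post.neurons,
        transform=weights,
        learning_rule_type=nengo.Oja(),
    )
    w_probe = nengo.Probe(conn, "weights", sample_every=0.1)

with nengo.Simulator(model) as sim:
    sim.run(1.0)

# Fraction of the initially-zero weights that became nonzero during learning
final_w = sim.data[w_probe][-1]
print(np.mean(final_w[~mask] != 0))
```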
