Image Classification and Learning with Semantic Pointers

I am new to NEF and SPA, having only gone through the lectures of the Nengo Summer School and tried some basic Nengo tutorials. Coming from a Deep Learning background, I want to make a simple MNIST classifier using Spiking Neural Networks. I see that there are examples which convert TensorFlow models into Nengo models for MNIST, but is it possible to achieve this without TensorFlow, using STDP learning rules instead? It would be great if anyone could point me to some literature on this.

Hello @yedanqi! Welcome to our community!

From your question, it seems that you are looking for Semantic Pointer-based methods for image classification that use local learning rules (e.g. STDP) rather than the TensorFlow route (i.e. ANN-to-SNN conversion).

A few points:

  • STDP in its basic form is an unsupervised local learning rule. MNIST image classification is basically a supervised problem, so STDP alone might not suffice there.
  • Moreover, to build a reasonable classifier with a network of neurons (spiking or non-spiking), you will probably want multiple layers, and training multiple layers isn’t straightforward with local learning rules. Backprop is obviously the winner here.
  • PES is a local supervised learning rule which works well with shallow SNNs. More details here. You may find it useful.
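
To make the PES point above a bit more concrete, here is a minimal NumPy sketch of a delta-rule decoder update of the kind PES performs. It is not Nengo's implementation: the rectified-linear tuning curves, the target function x², the learning rate, and every variable name are assumptions chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons = 50

# Hypothetical 1-D population: each rate neuron prefers +x or -x.
encoders = rng.choice([-1.0, 1.0], size=n_neurons)
gains = rng.uniform(0.5, 2.0, size=n_neurons)
biases = rng.uniform(-1.0, 1.0, size=n_neurons)


def activities(x):
    """Rectified-linear rate response of every neuron to a scalar input x."""
    return np.maximum(0.0, gains * encoders * x + biases)


def target(x):
    """Function the decoders should learn (an arbitrary choice)."""
    return x ** 2


def mean_squared_error(decoders, xs):
    """Average squared decoding error over a grid of inputs."""
    decoded = np.array([activities(x) @ decoders for x in xs])
    return float(np.mean((decoded - target(xs)) ** 2))


test_xs = np.linspace(-1.0, 1.0, 101)
decoders = np.zeros(n_neurons)
mse_before = mean_squared_error(decoders, test_xs)

learning_rate = 0.1  # assumed value
for x in rng.uniform(-1.0, 1.0, size=20000):
    a = activities(x)
    error = a @ decoders - target(x)  # decoded output minus desired output
    # Local update: each decoder changes in proportion to the shared error
    # signal and to that neuron's own activity.
    decoders -= (learning_rate / n_neurons) * error * a

mse_after = mean_squared_error(decoders, test_xs)
print(f"MSE before: {mse_before:.4f}, after: {mse_after:.4f}")
```

The update needs only the error signal and each neuron's own activity, which is what makes the rule local. It also only adjusts one set of decoders, which is why it suits shallow networks.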

You may also want to check SPAUN, where you can find details on MNIST image classification and semantic pointers. The following is a line from the SPAUN link regarding MNIST images:
The image impacts the visual system, and the neurons transform the raw image input (784 dimensions) to a lower dimensional (50 dimensions) Semantic Pointer that preserves central visual features. This is done using a visual hierarchy model (i.e., V1-V2-V4-IT) that implements a learned statistical compression.


Hi @zerone, thanks for the pointers and explanations! I will (hopefully) take a few days to digest and wrap my head around how SPAUN achieves its tasks. I find SPAUN pretty impressive, but I probably lack the reading to understand what is happening under the hood now. For now, I am working through this example. Since there is a chance that my understanding of SPA is incorrect, I would like to check if the following statements are accurate/precise.

  1. In the following code:
# n_hid, n_vis, n_out, ens_params, X_train, T_train and solver are defined elsewhere in the example
with nengo.Network(seed=3) as model:
    a = nengo.Ensemble(n_hid, n_vis, **ens_params)
    v = nengo.Node(size_in=n_out)
    conn = nengo.Connection(
        a, v, synapse=None, eval_points=X_train, function=T_train, solver=solver
    )

a and v are Semantic Pointers, where a represents the image and v represents the concepts of 0, 1, 2, 3, …, 9, which in the MNIST case are represented by one-hot-encoded (OHE) vectors. In this sense, the OHE vectors are Semantic Pointers as well, and v is just an approximation of them.
  2. The learning rule in this example is PES, and it is learning the one-hot-encoding function that converts Semantic Pointer a to v. The solver is used to solve for the decoders according to the PES learning rule.
  3. This example shows how different encoders affect how well we learn the decoders, and sparse Gabor filters are best. Gabor filters seem to act as feature extractors, much like the convolution layers in a CNN.

Hello @yedanqi, I too am not an expert on SPAUN; hopefully, others can be of better help to you. WRT your questions:

Regarding statement 1: not quite. Here, a is just a high-dimensional Ensemble of non-spiking neurons (note that the neuron_type is LIFRate()), which represents the high-dimensional MNIST images through the activations of its LIFRate() neurons. In its basic form, such a high-dimensional image representation is similar to representing a one-dimensional signal with a one-dimensional Ensemble.

Now, the output vector of this Ensemble a can be considered a Semantic Pointer. As you might have read by now, Semantic Pointers are simply high-dimensional vectors that carry some conceptual meaning. Similarly, the 10-dimensional output vector from Node v can be considered a Semantic Pointer.

Regarding statement 2: no, there is no iterative “learning” per se involved here. The decoders/weights are indeed fitted to the training data, but they are computed in a single step using the least-squares method with L2 regularization. Only a linear mapping is fitted, from the neuron activities of a to the OHE vectors output at v. PES is not used here. Note that PES is based on the delta rule, where the weights are learned iteratively (not in one step).
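
The one-step solve described above can be sketched in plain NumPy. This is only an illustration of regularized least squares on rate-neuron activities, not the code of Nengo's solver: the synthetic three-class data, the rectified-linear neurons, the random encoders, and the regularization value are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, n_per_class, n_dims, n_neurons = 3, 100, 2, 200

# Synthetic stand-in for a labelled dataset: three well-separated 2-D clusters.
centers = np.array([[1.0, 0.0], [-0.5, 0.87], [-0.5, -0.87]])
labels = np.repeat(np.arange(n_classes), n_per_class)
X = centers[labels] + 0.2 * rng.standard_normal((labels.size, n_dims))
T = np.eye(n_classes)[labels]  # one-hot targets, one row per sample

# Hypothetical rate neurons: random unit-length encoders, rectified-linear response.
encoders = rng.standard_normal((n_neurons, n_dims))
encoders /= np.linalg.norm(encoders, axis=1, keepdims=True)
biases = rng.uniform(-0.5, 0.5, size=n_neurons)
A = np.maximum(0.0, X @ encoders.T + biases)  # activities, samples x neurons

# Decoders from a single regularized least-squares solve:
#   D = (A^T A + lambda * I)^-1 A^T T
reg = 0.01  # assumed regularization strength
gram = A.T @ A + reg * labels.size * np.eye(n_neurons)
decoders = np.linalg.solve(gram, A.T @ T)

predictions = np.argmax(A @ decoders, axis=1)
accuracy = float(np.mean(predictions == labels))
print(f"training accuracy: {accuracy:.3f}")
```

There is no loop over training steps here: the decoders come out of one linear solve, in contrast to an iterative delta-rule update.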

Regarding statement 3: correct! You can think of an encoder as the neuron’s preferred direction vector, i.e. the direction in the vector space to which the neuron is most sensitive. For example, in a one-dimensional vector space, a neuron with an encoder of -1 is sensitive to negative values, i.e. it will fire spikes when a negative value is presented as the stimulus; similarly, a neuron with an encoder of +1 is sensitive to positive values, i.e. it will fire spikes when a positive value is presented as the stimulus. In general, the current J fed to a neuron in a Nengo Ensemble is J = \alpha \times <e, x> + J_{bias}, where <., .> is the dot product and \alpha, e, x, and J_{bias} are the neuron’s gain, encoder, input stimulus, and bias current, respectively. Note that for e = -1 and negative x, the dot product <e, x> is positive, so a positive current is driven into the neuron that is sensitive to negative values.
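
As a small worked example of the current equation above, the following sketch evaluates J for the two one-dimensional encoders discussed. The gain and bias values are assumptions picked only to keep the arithmetic easy to follow.

```python
def input_current(gain, encoder, x, bias):
    """J = gain * <encoder, x> + bias, with vectors given as sequences."""
    dot = sum(e_i * x_i for e_i, x_i in zip(encoder, x))
    return gain * dot + bias


gain, bias = 2.0, 0.5  # assumed values
x = [-0.5]             # a negative one-dimensional stimulus

j_negative_preferring = input_current(gain, [-1.0], x, bias)
j_positive_preferring = input_current(gain, [+1.0], x, bias)

print(j_negative_preferring)  # 1.5
print(j_positive_preferring)  # -0.5
```

The same negative stimulus drives a positive current into the neuron with encoder -1 and a negative current into the neuron with encoder +1.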

In line with the above, using Gabor filters as encoders makes each neuron sensitive to those image patches where its Gabor filter (i.e. e) correlates strongly (i.e. large <e, x>) with the local 2D arrangement of pixel values (i.e. x).
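
A rough NumPy sketch of that correlation follows. The Gabor parameterization (patch size, frequency, envelope width) and the grating test patches are assumptions for illustration and are not taken from the example's own encoder code.

```python
import numpy as np


def gabor(size=11, theta=0.0, frequency=0.2, sigma=2.0):
    """Gabor patch: a sinusoidal grating under a Gaussian envelope."""
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    along = xs * np.cos(theta) + ys * np.sin(theta)  # axis the grating varies along
    envelope = np.exp(-(xs ** 2 + ys ** 2) / (2.0 * sigma ** 2))
    return envelope * np.cos(2.0 * np.pi * frequency * along)


def grating(size=11, theta=0.0, frequency=0.2):
    """Plain sinusoidal grating used as a stand-in image patch."""
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    along = xs * np.cos(theta) + ys * np.sin(theta)
    return np.cos(2.0 * np.pi * frequency * along)


encoder = gabor(theta=0.0).ravel()                # e
matching_patch = grating(theta=0.0).ravel()       # x with the same orientation
rotated_patch = grating(theta=np.pi / 2).ravel()  # x rotated by 90 degrees

matching_response = float(encoder @ matching_patch)  # <e, x>
rotated_response = float(encoder @ rotated_patch)
print(matching_response, rotated_response)
```

The dot product is large when the patch has the filter's orientation and close to zero when the patch is rotated by 90 degrees, so the neuron receives far more input current for the patch it "prefers".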

I suppose @xchoo or other Nengo developers can correct me if I am wrong anywhere.

Hi @zerone, I really appreciate your explanations 🙂. Regarding the neuron type in this example, I changed the neurons back to LIF() neurons and seemed to achieve a higher accuracy. Is there any reason why the example uses LIFRate() neurons instead? (I see in the documentation that LIFRate() is not a spiking neuron.)

I think I can see now that there is no learning involved in this example, just the least-squares solver solving for the optimal decoders that map the encoded representations of the images onto the OHE vectors. I guess MNIST is simple enough that a single ensemble of neurons achieves pretty good results. I wonder if there is any way I can make the network deeper to learn a more complex dataset such as Fashion MNIST.

Hey @yedanqi, I am actually surprised that spiking LIF() achieved better accuracy! Didn’t expect that!!

The reason non-spiking LIFRate() was chosen could be that the authors of this tutorial wanted to do the training in a single time-step rather than over multiple time-steps. For that they needed a rate neuron activation function, e.g. nengo.LIFRate() or nengo.RectifiedLinear(), which outputs a value in one step. It’s just like solving a mathematical equation with real, continuous values.
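
To illustrate the one-step versus multi-step difference, here is a small pure-Python sketch. It compares the closed-form steady-state LIF rate with a spike count from a simple Euler simulation of a spiking LIF neuron. The input current, the time step, and the simplified integration scheme are assumptions; Nengo's own neuron models are implemented differently in detail.

```python
import math


def lif_rate(J, tau_rc=0.02, tau_ref=0.002):
    """Closed-form steady-state LIF firing rate (Hz) for a constant current J."""
    if J <= 1.0:
        return 0.0
    return 1.0 / (tau_ref + tau_rc * math.log1p(1.0 / (J - 1.0)))


def lif_spike_count(J, duration=1.0, dt=1e-4, tau_rc=0.02, tau_ref=0.002):
    """Count spikes of a spiking LIF neuron simulated step by step."""
    refractory_steps = int(round(tau_ref / dt))
    voltage, wait, spikes = 0.0, 0, 0
    for _ in range(int(round(duration / dt))):
        if wait > 0:          # neuron is silent during the refractory period
            wait -= 1
            continue
        voltage += dt * (J - voltage) / tau_rc  # Euler step of dv/dt = (J - v) / tau_rc
        if voltage >= 1.0:    # threshold crossed: emit a spike and reset
            spikes += 1
            voltage = 0.0
            wait = refractory_steps
    return spikes


J = 2.0  # assumed constant input current
rate = lif_rate(J)            # available from one function evaluation
count = lif_spike_count(J)    # needs many simulated time steps
print(f"rate model: {rate:.1f} Hz, spikes counted in 1 s: {count}")
```

The rate model gives its answer from a single formula evaluation, while the spiking model has to be run over many time steps before its average activity approaches the same value.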

Well… if you use LIFRate(), then you can certainly use backprop natively to train deeper layers. If you use spiking LIF(), then surrogate-gradient-descent-based methods are one way. PES on its own can’t be used to train deeper spiking layers. There could be other training methods too, but I am just starting on them and am a novice myself. Not to mention that effectively training SNNs directly is an active area of research.
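
For readers unfamiliar with the surrogate-gradient idea mentioned above, the core trick can be sketched in a few lines of pure Python. The forward pass keeps the hard spike threshold, while the backward pass substitutes a smooth stand-in derivative. The fast-sigmoid-shaped surrogate and the slope value below are one common choice and are assumptions here, not something prescribed by Nengo or by this thread.

```python
def spike(voltage, threshold=1.0):
    """Forward pass: hard threshold, 1.0 if the neuron spikes, else 0.0."""
    return 1.0 if voltage >= threshold else 0.0


def surrogate_gradient(voltage, threshold=1.0, slope=10.0):
    """Backward pass: derivative of a fast sigmoid, used in place of the
    step function's true derivative (which is zero almost everywhere)."""
    return 1.0 / (1.0 + slope * abs(voltage - threshold)) ** 2


for v in (0.5, 0.9, 1.0, 1.1):
    print(f"v={v:.1f}  spike={spike(v):.0f}  surrogate grad={surrogate_gradient(v):.3f}")
```

Because the surrogate is non-zero near the threshold, an error signal can flow back through spiking layers, which is what lets backprop-style training reach deeper spiking networks.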

A few sources I have for you:

which I am planning to go through myself. I’m afraid I can’t be of more help. Let’s wait for the other ABR experts here to give their input.

Hey @zerone, thanks for the resources. I found some other papers that train with some form of BP. I haven’t gotten around to reading them yet, but I will share them here since we are discussing this 🙂.

Temporal Spike Sequence Learning via Backpropagation for Deep Spiking Neural Networks | Papers With Code

Enabling Deep Spiking Neural Networks with Hybrid Conversion and Spike Timing Dependent Backpropagation | Papers With Code

It is exciting that training SNNs is an active area of research and that training methods are starting to appear at ICLR and NeurIPS.

Thank you @yedanqi for sharing these papers! Yes… they are gradually being accepted within NeurIPS and other traditionally ANN-based communities; however, the adoption is still sporadic. 🤞 for widespread adoption!