Binary Image Classification using SNN predicts all zeroes

Hi again, I am quite new to deep learning and NengoDL. I was following the “Spiking Neural Network using MNIST” and the “TensorFlow code to Nengo” examples, and I tried to apply the same procedure to my dataset of 1616 audio spectrogram images, which have to be classified as 0 or 1.

In the meantime, I trained a CNN model on the dataset and achieved 95% validation accuracy, and I wanted to see how an SNN would perform on the same data. I started with a simple CNN model and a low epoch count, converted it to an SNN using nengo_dl.Converter() with activation=nengo.SpikingRectifiedLinear(), and then compiled and trained the model. The model’s accuracy seems to sit in a range of 49–51%, and when I use sim.predict() I get a NumPy array of all zeroes. I’m pretty sure the problem has nothing to do with the SNN itself, and is some common mistake.

I have also attached the .ipynb file below for reference, and I would appreciate suggestions on how to track down this fault. I have 1616 audio spectrogram images: 808 images labelled as ‘1’ and 808 images labelled as ‘0’.

Thanks in advance :wink:

spiking__new.ipynb (83.4 KB)

Hi @JoshuaAlfred,

I didn’t have time to do a deep dive into your notebook, but from my quick look, there are definitely a few things that may be causing the issue.

Regarding the data labels, I see that you simply labelled the first half of the data as 1, and the second half of the data as 0. You’ll definitely want to check that the data itself is organized the same way. If the (actual) labels for the audio spectrograms do not match the labels you have assigned them, your network will be unable to learn the association between the audio data and the labels, which will result in an accuracy of about 50%. As a note, since your network has only 2 classes (label 0 or label 1), an accuracy of about 50% means that your network is essentially guessing.
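As a sketch of what I mean, here is a minimal label sanity check in NumPy. The variable names (`images`, `labels`) are hypothetical stand-ins for whatever your notebook uses; the key ideas are to verify that the label array actually lines up with the data ordering, and to shuffle data and labels with the *same* permutation so the pairs stay matched:

```python
import numpy as np

# Hypothetical stand-in for the real spectrogram data: 1616 samples.
n_samples = 1616
images = np.arange(n_samples)  # placeholder for the actual image array

# Labels assigned by position: first half 1, second half 0.
labels = np.concatenate([np.ones(808, dtype=int), np.zeros(808, dtype=int)])

# Sanity checks: the label array must line up with the data ordering.
assert len(labels) == len(images)
assert labels[:808].all() and not labels[808:].any()

# Shuffle data and labels with the SAME permutation so pairs stay matched.
rng = np.random.default_rng(seed=0)
perm = rng.permutation(n_samples)
images, labels = images[perm], labels[perm]

# Shuffling must not change the class balance.
print(labels.sum())  # still 808 ones
```

If the class balance or the per-index pairing changes after your preprocessing, that alone can pin the accuracy to ~50%.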

To check what your network is doing, you’ll also want to probe the output of a test run of your network. If your network is always guessing the same class (e.g., always outputting a 0), then for your specific problem, the test accuracy will be about 50% (because the network will get all of the 0’s correct, and all of the 1’s wrong).
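A quick way to detect this degenerate behaviour is to count how often each class appears among the predictions. This is just a NumPy sketch; `probs` stands in for the array you would get back from `sim.predict()`, and here I fake a network that always favours class 0:

```python
import numpy as np

# Hypothetical predicted outputs for 100 test inputs, faking a degenerate
# network that always favours class 0.
probs = np.tile([0.9, 0.1], (100, 1))

predicted = probs.argmax(axis=1)
counts = np.bincount(predicted, minlength=2)
print(counts)  # [100   0] -> the network only ever outputs class 0

if counts.min() == 0:
    print("Warning: network predicts a single class for every input")
```

If one of the counts is zero on your real predictions, the network has collapsed to a single class, which matches the all-zeroes array you are seeing.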

As for the network itself, I would recommend that you construct, train and test your network solely in TensorFlow first (to make sure it runs to your desired performance in TensorFlow) before adding in the NengoDL parts to do the spiking network conversion.

It looks like your images have colour to them. I haven’t worked much with colour images, so I can’t give many recommendations here. You’ll definitely want to make sure that the convolutional layers are set up properly to work with the 3 channels of the colour image. It may also help to simplify things by converting the coloured images to grayscale, although I’m not sure how important the colour components are to the spectrogram. As far as I understand, in a spectrogram the colour just encodes the magnitude of that frequency component, so you could pre-process the data to train on those magnitudes instead of the raw colour values.
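If you do go the grayscale route, a common approach (and this is just one option, assuming your images are stored as an RGB array with values in [0, 1]) is to apply standard luminance weights across the channel axis, keeping a singleton channel dimension so the convolutional layers still see a 4-D input:

```python
import numpy as np

# Hypothetical RGB batch: (n_images, height, width, 3), values in [0, 1].
rgb = np.random.default_rng(0).random((4, 32, 32, 3))

# Standard ITU-R BT.601 luminance weights for RGB -> grayscale.
weights = np.array([0.299, 0.587, 0.114])
gray = rgb @ weights          # shape: (4, 32, 32)
gray = gray[..., np.newaxis]  # keep a channel axis: (4, 32, 32, 1)

print(gray.shape)  # (4, 32, 32, 1)
```

Note that this only approximates the underlying frequency magnitudes if the spectrogram was rendered with a perceptual colormap; if you have access to the original audio, computing the spectrogram magnitudes directly would be cleaner.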

For the output of your network, typically, the last layer of the network (from the Nengo example, this is the dense layer) maps the activities of the previous layers onto the output classes. For the Nengo example, the MNIST dataset has 10 output classes (one for each digit from 0 to 9), so the output layer has 10 neurons / units. The activity of each neuron in the output layer determines how much the network “thinks” the input represents that specific output class. If we apply the same logic to your problem, the number of neurons in the dense layer should be 2. It seems like you are using a layer with a sigmoid function (in conjunction with the dense layer) to do this, but I’m not sure if this is achieving the desired output you want. It may be that removing the dense layer and connecting the dropout directly to the output layer will resolve your training issue.
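To make the distinction concrete, here is a small NumPy sketch of the two common ways to read out a binary classification (the output values are made up for illustration). With 2 output units you take the argmax across units; with a single sigmoid unit you threshold its output at 0.5. Mixing the two conventions (e.g., taking an argmax over a 1-unit output, which is always index 0) would produce exactly the all-zeroes predictions you describe:

```python
import numpy as np

# Option A: 2 output units (one per class); pick the class via argmax.
two_unit_out = np.array([[2.1, 0.3],
                         [0.2, 1.7],
                         [1.0, 1.5],
                         [3.0, 0.1],
                         [0.4, 2.2]])
pred_a = two_unit_out.argmax(axis=1)

# Option B: a single sigmoid unit; threshold its output at 0.5.
sigmoid_out = np.array([0.1, 0.8, 0.6, 0.05, 0.9])
pred_b = (sigmoid_out > 0.5).astype(int)

print(pred_a)  # [0 1 1 0 1]
print(pred_b)  # [0 1 1 0 1]

# Pitfall: argmax over a single-unit output is always 0.
always_zero = sigmoid_out[:, np.newaxis].argmax(axis=1)
print(always_zero)  # [0 0 0 0 0]
```

Either convention works, but the decoding step has to match the shape of your output layer.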