MNIST Classification Example in NengoDL

Hi Nengo team! I am relatively new to Nengo and SNNs (~2 months of reading documentation, following along with examples, etc.), but I have done a little more work with image classification and regular CNNs.

Main focus of this topic: Why does the exact same architecture/setup have such different training losses and test accuracies between MNIST and Fashion MNIST? I am aware that MNIST is essentially a “solved” dataset, and Fashion MNIST is more challenging.
My classification error after training with MNIST is 0.75% (same as if I used pretrained weights), and the training loss is 0.8347, which seems rather large for MNIST compared to similar architectures in TF/Keras.
However, the classification error I get after training with Fashion MNIST is ~33.31%, with a training loss of 61.3385.

The example is from the Jupyter Notebook here, but instead of loading in pretrained weights, I’ve set do_training = True to train it on my own.

Keeping everything else the same, I made only a few changes to the code:

  1. I imported each dataset from Keras, rather than via the urlretrieve method, because the latter gave a training set of 50000 samples rather than 60000 (see the sketch after the code below).
  2. I increased n_steps from 30 to 100.
  3. I wanted to use the entire test set to test the network, but the example reduces the number of test images to speed things up. All I did was remove both minibatch_size*2 slices from the code below.

test_data = {
    inp: np.tile(test_data[0][:minibatch_size*2, None, :], (1, n_steps, 1)),
    out_p_filt: np.tile(test_data[1][:minibatch_size*2, None, :], (1, n_steps, 1)),
}
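For reference, here is roughly what changes 1 and 3 look like together. This is just a sketch assuming the standard tf.keras dataset loaders and the flattened/one-hot format the notebook expects (inp, out_p_filt, and n_steps are the names from that example); swap in fashion_mnist for mnist as needed:

import numpy as np
import tensorflow as tf

# change 1: load the full 60000-sample training set from Keras
# (use tf.keras.datasets.fashion_mnist for Fashion MNIST)
(train_x, train_y), (test_x, test_y) = tf.keras.datasets.mnist.load_data()

# flatten 28x28 images to 784-dim vectors in [0, 1] and one-hot the labels
train_x = train_x.reshape((-1, 28 * 28)) / 255.0
test_x = test_x.reshape((-1, 28 * 28)) / 255.0
train_y = np.eye(10)[train_y]
test_y = np.eye(10)[test_y]

# change 3: tile over n_steps without the [:minibatch_size*2] slices,
# so that the whole test set is used
test_data = {
    inp: np.tile(test_x[:, None, :], (1, n_steps, 1)),
    out_p_filt: np.tile(test_y[:, None, :], (1, n_steps, 1)),
}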

Why is there such a drastic difference in accuracy/loss between MNIST and Fashion MNIST?

Follow-up questions related to this classification example:

  1. What was the rationale for configuring max_rates, intercepts, amplitude of the LIF neuron, and synapse of the out_p_filt probe to those specific values?
  2. Is there any benefit to setting trainable=True in configure_settings? (“we could train [the nengo objects] if we wanted, but they don’t add any representational power.”)
  3. Why use average pooling (as opposed to max pooling)?
  4. How do you get test loss?

Thank you in advance for your help!

Hi yadams!

I think the answer to this is just the point you raised above (“MNIST is essentially a ‘solved’ dataset, and Fashion MNIST is more challenging”). That example is tailored to solve MNIST as simply and quickly as possible; it isn’t designed to be a general solution for classification. So I wouldn’t expect it to produce similar accuracy when applied to a different dataset.

We typically set amplitude to 1/max_rates, so that the overall output of a neuron will be ~1. Other than that, those were just values that worked well after a small amount of manual hyperparameter searching.
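As a sketch (using the values I believe that example uses, so double-check against the notebook):

import nengo

max_rate = 100  # firing rate we expect neurons to saturate around

with nengo.Network() as net:
    # all neurons respond over the whole input range and saturate near max_rate
    net.config[nengo.Ensemble].max_rates = nengo.dists.Choice([max_rate])
    net.config[nengo.Ensemble].intercepts = nengo.dists.Choice([0])

    # amplitude=1/max_rate scales each spike so that a neuron firing
    # at max_rate has an overall output of ~1
    neuron_type = nengo.LIF(amplitude=1 / max_rate)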

In my experience there is no benefit to setting trainable=True in that example. But that depends on the network; with other network structures (which make more use of Nengo objects, rather than relying on tensor_layers), we would definitely want to be optimizing the Nengo parameters.
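For reference, the relevant configuration in the example is something like this (sketch from memory; check the notebook for the exact context):

import nengo
import nengo_dl

with nengo.Network() as net:
    # mark Nengo objects as non-trainable by default; flipping this to
    # True would include their parameters in the optimization
    nengo_dl.configure_settings(trainable=False)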

No strong reason; it’s just the one we picked for demonstration purposes in that example.

The test loss is already being computed in that example, e.g. with these lines:

print("error after training: %.2f%%" % sim.loss(
    test_data, {out_p_filt: classification_error}))

Hope that helps, let us know if you have any other questions.

Thanks for the quick response. I do have one more clarification question.

The resulting output of that print statement is something like
Calculation finished in 1:15:42
error after training: 0.75%

So in this case, loss and accuracy are essentially opposite(?) ways to measure the error by the network? i.e. accuracy = 100 - error_after_training?

Loss is a general term for the result of applying some evaluation function to the output of a network (https://en.wikipedia.org/wiki/Loss_function). It doesn’t refer to a particular function, so it isn’t really the “opposite” of anything.

“Accuracy” refers to a particular loss function (classification accuracy), which measures the proportion of correctly classified inputs. Classification error is the opposite of classification accuracy (the proportion of inputs classified incorrectly).
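For concreteness, the classification_error used in the sim.loss call above is a loss function like any other. A sketch of such a function (percent misclassified, based on the notebook’s TF 1.x-era API):

import tensorflow as tf

def classification_error(outputs, targets):
    # compare the predicted vs. true class on the last timestep and
    # return the percentage of inputs classified incorrectly
    return 100 * tf.reduce_mean(
        tf.cast(
            tf.not_equal(
                tf.argmax(outputs[:, -1], axis=-1),
                tf.argmax(targets[:, -1], axis=-1)),
            tf.float32))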

But you could have another loss function, such as mean squared error, that you also evaluate for your network. The mean squared error has no direct relation to classification accuracy or classification error; it is just a different way of measuring the output of the network.
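For example, you could evaluate both objectives on the same test data (a sketch, assuming the NengoDL version in that notebook supports the built-in "mse" objective string):

print("classification error: %.2f%%"
      % sim.loss(test_data, {out_p_filt: classification_error}))
print("mean squared error: %.4f"
      % sim.loss(test_data, {out_p_filt: "mse"}))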

I’m not sure if that answers your question or not?
