Hi @hetP, and welcome to the Nengo forums!
I played around with your code and have some comments:
Regarding this observation, I believe there are multiple causes for this:
The data you are using contains two labels, `0` and `1`, which I believe is what is intended for this dataset. However, the Nengo network is configured in such a way that it only produces a 1D (scalar) output:

```python
# dense linear readout
out = nengo.Node(size_in=1)
```

I'm not sure if this was intentional, but if your data contains two classes, then the `nengo.Node` here should have `size_in=2`.
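That is, something like this (keeping the rest of your readout connection unchanged):

```python
# dense linear readout, one output dimension per class
out = nengo.Node(size_in=2)
```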
Making this change, you will find that the network then fails to train, with TensorFlow throwing an error about mismatched array sizes. Looking into this, I noticed that you were using the `CategoricalCrossentropy` loss function instead of the `SparseCategoricalCrossentropy` loss function. According to the TensorFlow documentation, the `CategoricalCrossentropy` loss function should be used if your labels are "one-hot" encoded, meaning the target output of the network would be `[1, 0]` for the first class and `[0, 1]` for the second class. This is not the case with your dataset (the label is `0` for the first class and `1` for the second class), so in this instance you want to be using `SparseCategoricalCrossentropy` (see the documentation here).
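For reference, the compile step would then look something like this (a sketch only; `sim` stands in for your `nengo_dl.Simulator` and `out_p` for your output probe, and `from_logits=True` assumes your readout does not apply a softmax):

```python
import tensorflow as tf

# sketch: `sim` is your nengo_dl.Simulator, `out_p` your output probe
sim.compile(
    optimizer=tf.optimizers.Adam(),
    # SparseCategoricalCrossentropy expects integer labels (0 or 1),
    # which matches the format of your dataset
    loss={out_p: tf.losses.SparseCategoricalCrossentropy(from_logits=True)},
    metrics=["accuracy"],
)
```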
Making these changes results in a network where the probe accuracy increases with the number of epochs, but I did notice that no matter what I tried, the validation accuracy remained at about 50%. I'm not an expert at applying the LMU to various datasets, so perhaps @arvoelke or @Eric can chime in with suggestions on how to improve the network. My thought is that either the LMU is too good and immediately overfits the data, or that some pre- or post-processing layers (dropout, maybe?) are needed.
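On the dropout idea, something like the following might be worth a try (a sketch only; I'm assuming a network along the lines of the NengoDL LMU example, where `lmu.h` is the LMU's hidden state, and the rate of 0.5 is just an arbitrary starting point):

```python
import nengo
import nengo_dl
import tensorflow as tf

# sketch: apply dropout to the LMU hidden state before the readout;
# `lmu.h` follows the naming in the NengoDL LMU example
h = nengo_dl.Layer(tf.keras.layers.Dropout(rate=0.5))(lmu.h)

out = nengo.Node(size_in=2)
nengo.Connection(h, out, transform=nengo_dl.dists.Glorot(), synapse=None)
```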
There are, unfortunately, not many sources describing how to tune these LMUs. I would, however, refer you to our KerasLMU Python package, which is a TensorFlow-native implementation of the LMU (i.e., no Nengo needed). The KerasLMU documentation includes links to the original LMU paper, as well as an API reference describing what each of the LMU's parameters is meant to do.
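For instance, a minimal KerasLMU model might look something like this (a sketch adapted from the example in the KerasLMU documentation; the hyperparameters and input shape are placeholders you would tune for your dataset):

```python
import keras_lmu
import tensorflow as tf

# placeholders: sequence length and features per timestep of your data
seq_len, n_features = 784, 1

inputs = tf.keras.Input(shape=(seq_len, n_features))
lmus = keras_lmu.LMU(
    memory_d=1,    # dimensionality of the signal fed into the memory
    order=256,     # number of Legendre polynomials (memory resolution)
    theta=784,     # length of the sliding window, in timesteps
    hidden_cell=tf.keras.layers.SimpleRNNCell(units=212),
)(inputs)
outputs = tf.keras.layers.Dense(2)(lmus)  # one unit per class
model = tf.keras.Model(inputs=inputs, outputs=outputs)

model.compile(
    optimizer=tf.optimizers.Adam(),
    loss=tf.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
```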