I am very new to Nengo and my first project is to experiment with supervised learning methods on the Spiking Heidelberg Digits dataset.
The dataset consists of audio recordings of spoken digits converted into spike trains, along with labels indicating the correct digits. The goal, of course, is to solve the classification task, i.e. given the input spikes, figure out which digit was spoken.
For each training/test example, the spike trains are given in the form of two vectors of the same length, times and units. Times contains the time moments when a neuron spiked, and units indicates which neuron spiked, among 700 neurons in total.
In other words:
for i in range(len(times)): units[i] produced a spike at time times[i]
There is no regularity w.r.t. time:
- examples have variable total lengths, roughly between 0.6 and 1.1 seconds.
- there is no fixed sampling rate, i.e. when I take all the time moments in the dataset and sort them, the pairwise distances between consecutive moments are not fixed and share no common divisor, so I cannot force a fixed time step.
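To make the layout concrete, here is what one example looks like under this scheme (all values invented purely for illustration):

```python
import numpy as np

# Hypothetical example: 5 spikes from the 700-neuron population.
# times[i] is when a spike happened; units[i] is which neuron fired.
times = np.array([0.0021, 0.0038, 0.0038, 0.4172, 0.9410])  # seconds, irregular
units = np.array([17, 254, 17, 699, 3])                     # neuron indices in [0, 700)

# Every spike is a (time, unit) pair:
for t, u in zip(times, units):
    print(f"neuron {u} spiked at t = {t:.4f} s")
```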
I am going through the documentation and I am struggling to come up with a method to implement this stimulus for what will be my classifier network.
Any ideas and suggestions will be much appreciated.
Even though there’s no fixed sampling rate, you can still bin the spike times into timesteps. One way to do this would be to use np.histogram:
import numpy as np

# I'm creating some random times as demonstration data,
# but you'd use the spike times for each neuron
times = np.random.uniform(size=200)
dt = 0.001
max_length = 1.1
# bin edges every dt, covering the full [0, max_length] range
edges = dt * np.arange(int(max_length / dt) + 1)
spikes_per_timestep, _ = np.histogram(times, bins=edges)
When you bin each example, you can use your maximum length of 1.1 seconds, and any examples that don’t spike after e.g. 0.6 seconds will just have zero spikes after that point.
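Applied per neuron, that binning turns one variable-length example into a dense (n_units, n_timesteps) array of spike counts. A sketch, with toy `times`/`units` arrays standing in for one real example:

```python
import numpy as np

n_units = 700
dt = 0.001
max_length = 1.1
n_steps = round(max_length / dt)
edges = dt * np.arange(n_steps + 1)  # bin edges covering [0, max_length]

# Toy data standing in for one dataset example
rng = np.random.default_rng(0)
times = rng.uniform(0, 0.8, size=500)       # irregular spike times (s)
units = rng.integers(0, n_units, size=500)  # which neuron fired each time

# 2-D histogram: one row of binned spike counts per neuron
binned = np.zeros((n_units, n_steps))
for u in range(n_units):
    binned[u], _ = np.histogram(times[units == u], bins=edges)
```

The same result can be computed in one call with `np.histogram2d(units, times, bins=(np.arange(n_units + 1), edges))`, which avoids the Python loop.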
Hi there @Eric , thanks for your reply and sorry for taking so long to get back to you! I understand your solution and I think it is the best I can do with this dataset.
I have now realised that my troubles are due to the nature of this dataset. It is given as spike trains directly, so I cannot directly apply the techniques from the classification task examples in the documentation, such as Classifying MNIST digits with a spiking neural network or CIFAR-10 classification convolutional network with NengoLoihi.
For example in the MNIST example, the time element is incorporated by passing in the training images/labels for one time step and tiling the test images/labels for a number of time steps.
In my case, my data already has the time dimension built in. Instead of pixels x pixels, I have units x timesteps.
So my follow-up question is, is there any example I can look at where the input is given in spike trains and the task is a classification one?
Many thanks in advance, and apologies if I am breaking any rules by nesting questions.
Hi @Christos-14, as far as I’m aware, there aren’t any publicly available examples of Nengo networks doing classification directly on spiking input data. Typically, some sort of preprocessing is done on the spike data (one example is the histogram binning suggested by @Eric), and then classification is done on the preprocessed result (in the case of the histogram binning, the classification would be done on the histogram).
If you want to do the classification on the spike data itself, that’s a research question that’s beyond my expertise. @tbekolay did his PhD work on audio signal processing, so you may find some useful information there. The link to the GitHub repo containing his PhD work is here!
As a side note, developing networks with machine learning is often a fair bit of trial-and-error. You could try feeding the spiking input directly into a multi-layer LSTM network and see if that works. @tbekolay’s thesis will probably have some useful information on what the best networks to use for audio data classification are.
Even if you apply the binning I suggested, I would still call the result “spiking data” since the resulting data will still be quite sparse in time, assuming your bin width is relatively small (e.g. the 1 ms that we use as our standard timestep in Nengo).
There’s no reason that you can’t use that spiking data as an input to a network and train on it. In the CIFAR-10 network you linked to, I’ve got a layer called “input-layer” whose purpose is to take the non-spiking images and turn them into spiking data using SpikingRectifiedLinear neurons. So you could just remove that layer, since you’ve already got spikes.
One thing to keep in mind when training on spiking data is that it will be slower, simply because you’ve got this additional “time” dimension. So rather than training on e.g. a batch of images of shape (batch_size, height, width), you’ve now got a batch of shape (batch_size, timesteps, height, width), which is many times larger. One trick you could try if you’re finding things really slow is doing initial training at a larger timestep (e.g. 10 ms) so your network can do some basic learning, and then going to a shorter timestep to “fine-tune”. This is like moving your network more towards a rate-based network initially, which will help with speed and possibly convergence too.
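If the data is already binned at 1 ms, the coarser 10 ms version doesn’t need re-binning from the raw spike times; you can just sum groups of 10 bins. A sketch, with the array shapes assumed rather than taken from the original post:

```python
import numpy as np

# Stand-in for one example binned at 1 ms: (n_units, n_timesteps)
rng = np.random.default_rng(1)
fine = rng.poisson(0.05, size=(700, 1100)).astype(float)

factor = 10  # 1 ms -> 10 ms timestep
n_keep = (fine.shape[1] // factor) * factor  # drop any leftover bins
coarse = fine[:, :n_keep].reshape(fine.shape[0], -1, factor).sum(axis=2)
```

Here 1100 is divisible by 10, so no bins are dropped and every spike count is preserved in the coarse version.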
On the topic of convergence, you’ll want to make sure you have appropriate filtering on the output of the model, perhaps within the model too, and even on the input spike trains themselves. That will help mitigate the variance (a.k.a. “noise”) that comes from using spikes. You can also use rate neurons initially (e.g. nengo.RectifiedLinear) so that you’re not worrying about spikes in your network during initial training. Convergence issues are most prevalent early in training, so the easier you can make things at the start, the better, until your weights settle enough that you can add in these additional complexities.
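The filtering mentioned above can be approximated on a binned spike train with a simple discrete-time exponential (lowpass) filter. This is a NumPy sketch of the idea, not Nengo’s exact synapse implementation; the tau = 5 ms value matches Nengo’s default Lowpass synapse, and the spike amplitude convention (1/dt, so each spike has unit area) is an assumption for illustration:

```python
import numpy as np

dt = 0.001
tau = 0.005  # 5 ms lowpass time constant
decay = np.exp(-dt / tau)

# Toy spike train: impulses of area 1 at a few timesteps
spikes = np.zeros(1000)
spikes[[100, 101, 400, 401, 402]] = 1.0 / dt

# First-order lowpass: each step decays the state and mixes in the input
filtered = np.zeros_like(spikes)
acc = 0.0
for i, s in enumerate(spikes):
    acc = decay * acc + (1 - decay) * s
    filtered[i] = acc
```

The filtered trace rises at each spike and decays smoothly between them, which is what reduces the spike-induced variance during training.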