Hi all,
I know that for training a spiking network using nengo.Converter, “the basic idea is to use a differentiable approximation of the spiking neurons during the training process, and the actual spiking neurons during inference”.
But the example provided in the user guide (https://arxiv.org/pdf/1611.05141.pdf) works on image data rather than time series.
So I am wondering how nengo.Converter will train the network if the input shape is (batch size, time step, feature size) and time_step is greater than 1. I mean, in this case, what will be the differentiable approximation of this network? (I personally think the approximation network cannot be dynamic.) Will this input matrix be ‘thrown’ into the network ‘step by step’, or will the input layer be automatically ‘flattened’?
Thank you very much!
Hi @zhexin, and welcome to the Nengo forums!
I’m not entirely sure what you are referring to, as there is no Converter object in Nengo (core). I will assume you are referring to the nengo_dl.Converter object. In that case, the typical use case for nengo_dl.Converter is to convert a trained TensorFlow model into a spiking neuron model that can be run in Nengo (since TensorFlow doesn’t natively simulate or support spiking networks). In this process, the model is trained entirely in TensorFlow (or trained in TensorFlow using the NengoDL interface), and once the model has been trained to a satisfactory accuracy, it is converted (using nengo_dl.Converter) into a purely Nengo model that runs with spiking neurons.
Since the training is done with TensorFlow, the process of training the network to work with time series data (as opposed to static data) is the same in both TensorFlow and NengoDL. You can refer to our NengoDL LMU network example for a reference on how this is done. Note that for the network to be effective, you’ll need some sort of memory (e.g., LMU, LSTM) within the network.
Regarding the differentiable approximation: I’m not entirely sure what you are asking here. For both the time-series and static data training processes, the differentiable approximation would be the same: the rate-based neuron model. If your network is configured to use LIF neurons, then you can use the SoftLIF neuron model as per the paper you linked.
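To make the "rate-based approximation" concrete, here is a rough NumPy sketch of the LIF rate curve and its SoftLIF smoothing as I understand them from the paper. It is not NengoDL's actual implementation; the function names and the parameter values (tau_rc, tau_ref, gamma, and a firing threshold of 1) are assumptions chosen for illustration.

```python
import numpy as np


def lif_rate(j, tau_rc=0.02, tau_ref=0.002):
    """Steady-state LIF firing rate (Hz) for input current j, threshold 1.

    Parameter values are illustrative assumptions.
    """
    j = np.atleast_1d(np.asarray(j, dtype=float))
    rates = np.zeros_like(j)
    above = j > 1.0  # the neuron only fires above threshold
    rates[above] = 1.0 / (tau_ref + tau_rc * np.log1p(1.0 / (j[above] - 1.0)))
    return rates


def soft_lif_rate(j, gamma=0.05, tau_rc=0.02, tau_ref=0.002):
    """SoftLIF: replace max(j - 1, 0) with a softplus so the curve is smooth.

    gamma controls the amount of smoothing (assumed value).
    """
    j = np.atleast_1d(np.asarray(j, dtype=float))
    # softplus gamma * log(1 + exp(x / gamma)), computed stably
    rho = gamma * np.logaddexp(0.0, (j - 1.0) / gamma)
    return 1.0 / (tau_ref + tau_rc * np.log1p(1.0 / rho))


currents = np.array([0.5, 1.0, 1.5, 2.0])
print(lif_rate(currents))
print(soft_lif_rate(currents))
```

The hard LIF curve has a kink at the threshold where the gradient is undefined; the softened version is smooth everywhere, which is what makes it usable for backpropagation. The same curve is applied at every timestep, so nothing about it changes for time-series input.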
As for whether the input is fed in step by step or flattened: it really depends on how you want to train the network, but both approaches are viable. If you want the network to work on the entire time series at once (i.e., it has access to the whole time series), then you can configure the input as one giant flattened matrix (or array). But you can also have a stateful network that operates on the time-series data one timestep at a time, as is done in the LMU example.
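As a rough illustration of the two options, here is a small NumPy sketch. The array sizes are made up, and the leaky integrator in run_stateful is only a hypothetical stand-in for a real memory component such as an LMU or LSTM; it is not how the LMU example is implemented.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, n_steps, n_features = 4, 10, 3  # made-up sizes
x = rng.normal(size=(batch_size, n_steps, n_features))

# Option 1: flatten the time axis so the network sees the whole
# series at once as a single static input vector per example.
x_flat = x.reshape(batch_size, n_steps * n_features)


# Option 2: feed one timestep at a time into a stateful network.
def run_stateful(x, decay=0.9):
    """Toy stateful 'network': a leaky integrator over the time axis."""
    state = np.zeros((x.shape[0], x.shape[2]))
    outputs = []
    for t in range(x.shape[1]):
        # the state carries information forward from earlier timesteps
        state = decay * state + (1.0 - decay) * x[:, t, :]
        outputs.append(state)
    return np.stack(outputs, axis=1)  # (batch, time, features)


y = run_stateful(x)
print(x_flat.shape, y.shape)
```

With the flattened input the time dimension disappears and the input layer grows with the sequence length; with the stateful version the input layer stays at the feature size and the memory has to live inside the network.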
What a nice answer! Thank you very much!