Is it possible to use an SNN as a Q-function approximator?


#1

Hello everyone,

I am working on a project in which our team is trying to solve a robot control problem with reinforcement learning.
The algorithm we use is Q-learning, and we want to use a spiking neural network (SNN) as the Q-function approximator.

Usually an artificial neural network (ANN) is used as the Q-function approximator, so our idea was to directly replace the ANN with our spiking neural network.

But it does not work, whereas the same algorithm works well with a normal artificial neural network.

So my questions are:
1, Can we use an SNN as a Q-function approximator?
2, If not, what is the standard way to do reinforcement learning with spiking neural networks?

best greetings


#2

You can check out this thread for links to various papers on RL in Nengo, as well as code examples.


#3

If you’re doing robot control, you may be more interested in @travis.dewolf’s adaptive control work?


#4

Hello drasmuss,

Thanks for your answer. We have actually already made the robot work with a normal artificial neural network as the Q-function approximator, so it would be easiest for us if we could simply replace the ANN with a spiking neural network.

We are now trying nengo_dl, because this simulator is quite similar to the ANN frameworks.
We have a few questions and hope you can help us:
1, I do not understand the “minibatch_size” argument of the nengo_dl simulator. For example, the MNIST dataset has 60000 training images, and when I start the training I feed the whole dataset into the nengo model. Should I set minibatch_size to 60000, or can I give it an arbitrary number, for example 60?
There is another argument, n_epochs. I guess we can give minibatch_size an arbitrary number, for example 60, and if we train with n_epochs = 1000, the whole dataset of 60000 images will be trained on. Am I right?

2, After training, I print sim.loss and it is 0.1. But when I do prediction, the accuracy is very low, so I think I didn’t find the right way to do prediction. Please tell me the right way to do it. My code is here:

import nengo
import nengo_dl
import numpy as np
import tensorflow as tf

class Spiking_Qnetwork:

    def __init__(self, input_shape, output_shape, nb_hidden, weights_path):
        '''
        Spiking neural network as the Q value function approximation
        :param input_shape: the input dimension without batch_size, example: state is 2 dimension,
        action is 1 dimension, the input shape is 3.
        :param output_shape: the output dimension without batch_size, the dimension of Q values
        :param nb_hidden: the number of neurons in ensemble
        :param weights_path: the path to save all parameters
        '''
        self.input_shape = input_shape
        self.output_shape = output_shape
        self.nb_hidden = nb_hidden
        self.weights_path = weights_path

        self.model = self.build()

    def encoder_decoder_initialization(self, shape):
        '''
        :return: initialised encoder or decoder
        '''
        rng = np.random.RandomState(1)
        coders = rng.normal(size=shape)
        return coders

    def build(self):
        encoders = self.encoder_decoder_initialization(shape=(self.nb_hidden, self.input_shape))
        decoders = self.encoder_decoder_initialization(shape=((self.nb_hidden, self.output_shape)))
        model = nengo.Network(seed=3)
        with model:
            self.input_neuron = nengo.Ensemble(n_neurons=self.nb_hidden,
                                               dimensions=self.input_shape,
                                               neuron_type=nengo.LIFRate(),
                                               intercepts=nengo.dists.Uniform(-1.0, 1.0),
                                               max_rates=nengo.dists.Choice([100]),
                                               encoders=encoders,
                                               )
            output = nengo.Node(size_in=self.output_shape)
            self.output_p = nengo.Probe(output)
            nengo.Connection(self.input_neuron.neurons,
                             output,
                             synapse=None,
                             transform=decoders.T
                             )
        return model

    def training(self, input_data, label, batch_size, nb_epochs):
        sim = nengo_dl.Simulator(self.model,
                                 minibatch_size=batch_size,
                                 step_blocks=1,
                                 device="/gpu:0"
                                 )

        sim.train({self.input_neuron: input_data},
                  {self.output_p: label},
                  tf.train.MomentumOptimizer(5e-2, 0.9),
                  n_epochs=nb_epochs
                  )
        sim.save_params(self.weights_path)
        sim.close()


    def predict(self, input_data, batch_size=1):
        sim = nengo_dl.Simulator(self.model,
                                 minibatch_size=batch_size,
                                 step_blocks=1,
                                 device="/gpu:0")
        sim.load_params(self.weights_path)
        sim.step(input_feeds={self.input_neuron: input_data})
        output = sim.data[self.output_p]
        sim.close()
        return output

if __name__ == '__main__':

    from keras.datasets import mnist
    from keras.utils import np_utils
    from sklearn.metrics import accuracy_score

    (X_train, y_train), (X_test, y_test) = mnist.load_data()
    # data pre-processing
    X_train = X_train.reshape(X_train.shape[0], -1) / 255.  # normalize
    X_test = X_test.reshape(X_test.shape[0], -1) / 255.  # normalize
    y_train = np_utils.to_categorical(y_train, nb_classes=10)
    y_test = np_utils.to_categorical(y_test, nb_classes=10)

    X_train_ = np.expand_dims(X_train, axis=1)
    X_test_ = np.expand_dims(X_test, axis=1)
    y_train_ = np.expand_dims(y_train, axis=1)
    y_test_ = np.expand_dims(y_test, axis=1)


    model = Spiking_Qnetwork(input_shape=28*28,
                             output_shape=10,
                             nb_hidden=1000,
                             weights_path="/home/huangbo/SpikingDeepRLControl/huangbo_ws/"
                                          "networks/saved_weights/snn_weights")

    # training
    model.training(input_data=X_train_, label=y_train_, batch_size=10000, nb_epochs=5)

    output = model.predict(batch_size=X_test.shape[0], input_data=X_test_)
    prediction = np.squeeze(output, axis=1)

    # evaluate the model
    acc = accuracy_score(np.argmax(y_test, axis=1), np.argmax(prediction, axis=1))
    print "the test acc is:", acc

#5

Just to be clear, the papers in that thread I linked to all use spiking neural networks to approximate the Q function.

There is a good explanation of the batch/minibatch/epoch terminology of machine learning here.
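
To make the terminology concrete, here is a small sketch of the arithmetic. The helper name `updates_per_run` is made up for illustration and is not part of nengo_dl, and it assumes the dataset size is a multiple of the minibatch size. One epoch is one pass over the whole dataset, split into minibatches, so with 60000 images and a minibatch size of 60 a single epoch already consists of 1000 parameter updates:

```python
def updates_per_run(n_examples, minibatch_size, n_epochs):
    """Number of optimizer updates when every epoch visits the whole dataset once.

    Hypothetical helper for illustration only; assumes n_examples is a
    multiple of minibatch_size.
    """
    updates_per_epoch = n_examples // minibatch_size
    return updates_per_epoch * n_epochs


print(updates_per_run(60000, 60, 1))     # 1000 updates in a single epoch
print(updates_per_run(60000, 60, 1000))  # 1000000 updates over 1000 epochs
```

So the minibatch size only controls how many examples go into each update, while the number of epochs controls how many times the full dataset is seen.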

You need an input node in your model in order to pass in input_data (you can’t pass it directly into an Ensemble). So you’d need something like

with model:
    input_node = nengo.Node(...)
    nengo.Connection(input_node, self.input_neuron.neurons, ...)

...

sim.train({input_node: input_data}, ...)

Right now nengo_dl is just ignoring invalid input feeds – it should give an error to help identify problems like this, I’ll make that change. Thanks for bringing that to light!

Also, keep in mind that if you want to train something with gradient descent (e.g., tf.train.MomentumOptimizer), then your network needs to be differentiable (which nengo.LIFRate is not). However, you can try training with nengo_dl.SoftLIFRate (a differentiable approximation of LIF neurons), and then switching to spiking neurons when you do your prediction.
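
As a rough illustration of the difference, here is a sketch of the two rate equations in plain Python. This is not the nengo or nengo_dl source code; it follows the usual LIF rate formula and the soft-LIF idea of replacing the hard threshold with a softplus, and the parameter values (including `sigma`) are just illustrative assumptions:

```python
import math


def lif_rate(J, tau_rc=0.02, tau_ref=0.002):
    """Steady-state LIF firing rate (Hz) for input current J; zero below the threshold J = 1."""
    j = J - 1.0
    if j <= 0:
        return 0.0
    return 1.0 / (tau_ref + tau_rc * math.log1p(1.0 / j))


def soft_lif_rate(J, sigma=0.1, tau_rc=0.02, tau_ref=0.002):
    """Soft-LIF rate: max(J - 1, 0) is replaced by a softplus with smoothing width sigma."""
    x = (J - 1.0) / sigma
    # softplus, with a shortcut for large x to avoid overflow in exp
    j = sigma * (x if x > 30 else math.log1p(math.exp(x)))
    if j <= 0:
        return 0.0
    return 1.0 / (tau_ref + tau_rc * math.log1p(1.0 / j))


for J in (0.9, 1.0, 1.1, 2.0, 5.0):
    print(J, lif_rate(J), soft_lif_rate(J))
```

Below the threshold the hard LIF rate is exactly zero (so no gradient flows), while the soft version is small but nonzero and changes smoothly; well above the threshold the two curves agree.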


#6

Hello drasmuss,

Thanks for your quick answer, it is really helpful.
First, I changed my neuron type to SoftLIFRate and added an input node to receive the input.
There are still some things that confuse me:
1, Training time.
One of the strengths of nengo_dl is that it uses TensorFlow as its backend, which should make training very fast. But I found that training actually takes very long. Take the MNIST case: when I use the normal nengo simulator and the NEF to train a nengo network, I feed the whole MNIST dataset into the network at once and the training takes 17 s. With nengo_dl, the training time depends on the number of epochs we set. If we set batch_size to 60 and train on the 60000 MNIST images for 1000 epochs, it takes 8 hours.
2, Saving the parameters.
I think this function is very similar to what we do in TensorFlow: we can’t change the architecture of the network, and the function saves all the trainable parameters. Is that right?
3, Prediction.
After the training, we want to test the performance of the network on a single image. So we change the batch_size of the simulator to 1 and run one step of simulation to get the output. We are not sure if this is the right way to do prediction:
sim.step(input_feeds={self.input_neuron: input_data})
output = sim.data[self.output_p]

In normal nengo, we do this to get the output of the network:
_, acts = nengo.utils.ensemble.tuning_curves(input_neuron, sim, inputs=input)
The reason we ask is that in nengo_dl the loss after training is very small, but the prediction performance is very bad, as if nothing had been trained.


#7

The new code is like this:
import nengo
import nengo_dl
import numpy as np
import tensorflow as tf

class Spiking_Qnetwork:

    def __init__(self, input_shape, output_shape, nb_hidden, weights_path):
        '''
        Spiking neural network as the Q value function approximation
        :param input_shape: the input dimension without batch_size, example: state is 2 dimension,
        action is 1 dimension, the input shape is 3.
        :param output_shape: the output dimension without batch_size, the dimension of Q values
        :param nb_hidden: the number of neurons in ensemble
        :param weights_path: the path to save all parameters
        '''
        self.input_shape = input_shape
        self.output_shape = output_shape
        self.nb_hidden = nb_hidden
        self.weights_path = weights_path

        self.model = self.build()

    def encoder_decoder_initialization(self, shape):
        '''
        :return: initialised encoder or decoder
        '''
        rng = np.random.RandomState(1)
        coders = rng.normal(size=shape)
        return coders

    def build(self):
        encoders = self.encoder_decoder_initialization(shape=(self.nb_hidden, self.input_shape))
        decoders = self.encoder_decoder_initialization(shape=((self.nb_hidden, self.output_shape)))

        model = nengo.Network(seed=3)
        with model:
            nengo_dl.configure_trainable(model, default=False)
            model.config[nengo.Ensemble].neuron_type = nengo_dl.neurons.SoftLIFRate()
            model.config[nengo.Ensemble].gain = nengo.dists.Choice([1])
            model.config[nengo.Ensemble].bias = nengo.dists.Uniform(-1, 1)
            model.config[nengo.Connection].synapse = None

            self.input_node = nengo.Node(size_in=self.input_shape)
            layer = nengo.Ensemble(n_neurons=self.nb_hidden,
                                   dimensions=self.input_shape,
                                   encoders=encoders,
                                   )
            nengo.Connection(self.input_node, layer)

            output = nengo.Node(size_in=self.output_shape)
            self.output_p = nengo.Probe(output)
            conn = nengo.Connection(layer.neurons,
                                    output,
                                    transform=decoders.T
                                    )
            model.config[conn].trainable = True
        return model

    def training(self, input_data, label, total_nb_dataset, batch_size, nb_epochs=None):
        if nb_epochs is None:
            nb_epochs = total_nb_dataset//batch_size
        sim = nengo_dl.Simulator(self.model,
                                 minibatch_size=batch_size,
                                 step_blocks=1,
                                 device="/gpu:0",
                                 seed=2,
                                 )

        sim.train({self.input_node: input_data},
                  {self.output_p: label},
                  tf.train.MomentumOptimizer(5e-2, 0.9),
                  n_epochs=nb_epochs
                  )
        sim.save_params(self.weights_path)
        sim.close()


    def predict(self, input_data, batch_size=1):
        sim = nengo_dl.Simulator(self.model,
                                 minibatch_size=batch_size,
                                 step_blocks=1,
                                 device="/gpu:0",
                                 seed=1)
        sim.load_params(self.weights_path)
        sim.step(input_feeds={self.input_node: input_data})
        output = sim.data[self.output_p]
        sim.close()
        return output

if __name__ == '__main__':

    from keras.datasets import mnist
    from keras.utils import np_utils
    from sklearn.metrics import accuracy_score

    (X_train, y_train), (X_test, y_test) = mnist.load_data()
    # data pre-processing
    X_train = X_train.reshape(X_train.shape[0], -1) / 255.  # normalize
    X_test = X_test.reshape(X_test.shape[0], -1) / 255.  # normalize
    y_train = np_utils.to_categorical(y_train, nb_classes=10)
    y_test = np_utils.to_categorical(y_test, nb_classes=10)

    X_train_ = np.expand_dims(X_train, axis=1)
    X_test_ = np.expand_dims(X_test, axis=1)
    y_train_ = np.expand_dims(y_train, axis=1)
    y_test_ = np.expand_dims(y_test, axis=1)


    model = Spiking_Qnetwork(input_shape=28*28,
                             output_shape=10,
                             nb_hidden=1000,
                             weights_path="/home/huangbo/SpikingDeepRLControl/huangbo_ws/"
                                          "networks/saved_weights/snn_weights")

    # training
    model.training(input_data=X_train_, label=y_train_, total_nb_dataset=60000, batch_size=600, nb_epochs=50)

    output = model.predict(batch_size=X_test.shape[0], input_data=X_test_)
    prediction = np.squeeze(output, axis=1)

    # evaluate the model
    acc = accuracy_score(np.argmax(y_test, axis=1), np.argmax(prediction, axis=1))
    print "the test acc is:", acc

#8

The normal NEF training time will definitely be much faster than gradient descent techniques; that is the main advantage of the NEF optimization, that it is very quick to compute. However, 1000 epochs is almost certainly more than you would need, something on the order of 10 should be enough for MNIST (you are doing the NEF optimization with 1 epoch).

That is correct.

You need to feed the input into the Node, not the Ensemble (the same idea as when you are doing the training). It looks like you are doing that in the code you posted though, in which case that should be correct. One thing to keep in mind is that the loss you are training on is mean squared error, not classification accuracy, so the values you see during training will be different than the test error you’re computing at the end.

I would actually be kind of surprised if that training was working (a single hidden layer with random encoders is much simpler than most mnist networks successfully trained via gradient descent). So my first guess would be that the training isn’t actually succeeding, which is why your prediction performance is poor. However, if you are seeing MSE values that do correspond to good performance after training, then there could be something wrong with the weight saving/loading. You could try just doing your prediction right after the sim.train (rather than saving the weights and then loading them in predict), just to see if that is what is causing the problem.


#9

Hello drasmuss,

Thanks for the help!
First of all, our reinforcement learning algorithm now works with the normal nengo simulator. But I am really interested in nengo_dl, so I still want to fix the problem.
I made the changes you mentioned: I now use a Node to feed in the data for both prediction and training.
I also tested doing the prediction directly, without saving the parameters. Unfortunately, it still does not work.

Here are the results:

----------------------------------------------#

Building networkBuilding completed in 0:00:00
Optimizing graphOptimization completed in 0:00:00
Construction completed in 0:00:00
Using TensorFlow backend.
[##############################] ETA: 0:00:00 (Training)
Training completed in 0:03:23
Simulation startedSimulation completed in 0:00:00
the test acc is: 0.1

And here is the code:

----------------------------------------------#

import nengo
import nengo_dl
import numpy as np
import tensorflow as tf

with nengo.Network(seed=0) as model:

    model.config[nengo.Ensemble].neuron_type = nengo_dl.neurons.SoftLIFRate()
    model.config[nengo.Ensemble].gain = nengo.dists.Choice([1])
    model.config[nengo.Ensemble].bias = nengo.dists.Uniform(-1, 1)
    model.config[nengo.Connection].synapse = None

    # initialize encoder and decoder
    rng = np.random.RandomState(1)
    encoders = rng.normal(size=(1000, 784))
    decoders = rng.normal(size=(1000, 10))

    # network
    input_node = nengo.Node(size_in=784)
    layer = nengo.Ensemble(n_neurons=1000,
                           dimensions=784,
                           encoders=encoders,
                           )
    nengo.Connection(input_node, layer)
    output = nengo.Node(size_in=10)
    output_p = nengo.Probe(output)
    conn = nengo.Connection(layer.neurons,
                            output,
                            transform=decoders.T
                            )

with nengo_dl.Simulator(model, minibatch_size=60, step_blocks=1, device="/gpu:0", seed=2) as sim:

    from keras.datasets import mnist
    from keras.utils import np_utils

    (X_train, y_train), (X_test, y_test) = mnist.load_data()
    # data pre-processing
    X_train = X_train.reshape(X_train.shape[0], -1) / 255.  # normalize
    X_test = X_test.reshape(X_test.shape[0], -1) / 255.  # normalize
    y_train = np_utils.to_categorical(y_train, nb_classes=10)
    y_test = np_utils.to_categorical(y_test, nb_classes=10)

    X_train_ = np.expand_dims(X_train, axis=1)
    X_test_ = np.expand_dims(X_test, axis=1)
    y_train_ = np.expand_dims(y_train, axis=1)
    y_test_ = np.expand_dims(y_test, axis=1)

    sim.train({input_node: X_train_},
              {output_p: y_train_},
              tf.train.MomentumOptimizer(5e-2, 0.9),
              n_epochs=10
              )


    sim.step(input_feeds={input_node: X_test_[0:60,:,:]})
    output = sim.data[output_p]
    prediction = np.squeeze(output, axis=1)

    # evaluate the model
    from sklearn.metrics import accuracy_score
    acc = accuracy_score(np.argmax(y_test[0:60,:], axis=1), np.argmax(prediction, axis=1))
    print "the test acc is:", acc

Further questions:
1, Do we always have to create a new simulator when the batch_size changes?
When we create the nengo_dl simulator, the batch_size has to be set. For training we may use a bigger batch_size, but for prediction we just feed a single image into the network, so we need to set the batch_size to 1.

2, Is the way I do prediction right?
#-------------------------------
sim.step(input_feeds={input_node: X_test_[1,:,:]})
output = sim.data[output_p]
#-------------------------------
I tested saving and loading the parameters by using the loss: print sim.loss({input_node: X_test_}, {output_p: y_test_}, "mse").
After training, I create a new simulator and first load the parameters; the loss is then also 0.1.
If I don’t load the parameters, the loss is very big.
#------------------------------------------------------#
0.10000000149
Building completed in 0:00:00
Optimization completed in 0:00:00
Construction completed in 0:00:00
#------------------------------------------------------#

So I think the training actually works, and I guess the problem is the way I do prediction, because that is the only difference from what I did before with the normal nengo simulator.


#10

An MSE of 0.1 probably means that the training was not very successful. E.g., if your network just outputs [0 0 0 0 0 0 0 0 0 0] for every training example, then the MSE will be 0.1. So the problem is in the training, not the prediction. Also note that using MSE as the loss function when performing classification is usually not recommended, for exactly this reason (an apparently low error can actually correspond to poor classification). You’d probably get better results using cross entropy. You can read more about this here.
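
A quick way to convince yourself of the 0.1 figure is the following plain-Python sketch. The labels are randomly generated stand-ins, not MNIST, and the helper names are made up for illustration:

```python
import random

random.seed(0)
n_classes = 10
labels = [random.randrange(n_classes) for _ in range(1000)]  # made-up labels


def one_hot(label):
    """One-hot target vector for a class label."""
    return [1.0 if k == label else 0.0 for k in range(n_classes)]


def mse(outputs, targets):
    """Mean squared error over all examples and all output dimensions."""
    errors = [(o - t) ** 2
              for out, tgt in zip(outputs, targets)
              for o, t in zip(out, tgt)]
    return sum(errors) / len(errors)


targets = [one_hot(label) for label in labels]
all_zero = [[0.0] * n_classes for _ in labels]  # a "network" that always outputs zeros

print(mse(all_zero, targets))  # prints 0.1, whatever the labels are
```

Each one-hot target contains a single 1 among ten entries, so an all-zero output is wrong in exactly one entry out of ten, which gives an MSE of 1/10 without classifying anything.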

However, as I mentioned, training a one-layer network with random encoders to perform MNIST classification is still probably not going to work great. If that is what you want to do, you’re better off using the NEF optimization, because it is optimal for a single layer optimization like that. If you want to see the advantages of deep learning methods, you’ll probably need to use a more complex, multi-layer system.


#11

Yes, you have to rebuild the simulator to change the batch size (since it is built into the graph, for performance reasons). However, if you don’t want to rebuild the simulator you can just pass in n images, and ignore the ones you’re not interested in (e.g., use sim.data[output_p][0] to just look at the results for the first image).
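
A minimal sketch of that padding trick, using NumPy only. The helper `pad_to_minibatch` is hypothetical (it is not a nengo_dl function), and it assumes the data has the shape (number of examples, number of timesteps, dimensions) used in the code above:

```python
import numpy as np


def pad_to_minibatch(x, minibatch_size):
    """Pad a (n, n_steps, dims) array with zeros along axis 0 up to minibatch_size."""
    n = x.shape[0]
    if n > minibatch_size:
        raise ValueError("got %d inputs for a minibatch of size %d" % (n, minibatch_size))
    padding = np.zeros((minibatch_size - n,) + x.shape[1:], dtype=x.dtype)
    return np.concatenate([x, padding], axis=0)


single_image = np.ones((1, 1, 784))         # one flattened image, one timestep
batch = pad_to_minibatch(single_image, 60)  # same simulator, minibatch_size=60
print(batch.shape)                          # (60, 1, 784)
```

After stepping the simulator with `batch`, only the first row of the probe data corresponds to the real image; the remaining rows belong to the zero padding and can be discarded.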


#12

Hello drasmuss,

I made the changes you mentioned: I added 2 more layers and changed the loss function to cross entropy. But it still does not work. I also tested different optimizers, and the result is still bad.
The outputs are all 0 and it seems that no training happens, even though I have already set everything to trainable.
#---------------------------------------------------
cross_entropy as loss is: 2.30258536339
the test acc is: 0.098
#---------------------------------------------------

import nengo
import nengo_dl
import numpy as np
import tensorflow as tf

def cross_entropy(prediction, label):
    return tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=label, logits=prediction))

with nengo.Network(seed=0) as model:

    nengo_dl.configure_trainable(model, default=True)

    model.config[nengo.Ensemble].neuron_type = nengo_dl.neurons.SoftLIFRate()
    model.config[nengo.Ensemble].gain = nengo.dists.Choice([1])
    model.config[nengo.Ensemble].bias = nengo.dists.Uniform(-1, 1)
    model.config[nengo.Ensemble].trainable = True
    model.config[nengo.Connection].trainable = True
    model.config[nengo.Connection].synapse = None

    # initialize encoder and decoder
    rng = np.random.RandomState(1)
    encoders = rng.normal(size=(1000, 784))
    decoders = rng.normal(size=(1000, 10))

    # network
    input_node = nengo.Node(size_in=784)
    layer_1 = nengo.Ensemble(n_neurons=1000,
                             dimensions=784,
                             encoders=encoders
                             )
    layer_2 = nengo.Ensemble(n_neurons=1000,
                             dimensions=784
                             )
    layer_3 = nengo.Ensemble(n_neurons=1000,
                             dimensions=784,
                             )
    output = nengo.Node(size_in=10)

    conn_1 = nengo.Connection(input_node, layer_1)
    conn_2 = nengo.Connection(layer_1, layer_2)
    conn_3 = nengo.Connection(layer_2, layer_3)
    conn_4 = nengo.Connection(layer_3.neurons, output, transform=decoders.T)

    output_p = nengo.Probe(output)

with nengo_dl.Simulator(model, minibatch_size=60, step_blocks=1, device="/gpu:0", seed=2) as sim:

    from keras.datasets import mnist
    from keras.utils import np_utils


    (X_train, y_train), (X_test, y_test) = mnist.load_data()
    # data pre-processing
    X_train = X_train.reshape(X_train.shape[0], -1) / 255.  # normalize
    X_test = X_test.reshape(X_test.shape[0], -1) / 255.  # normalize
    y_train = np_utils.to_categorical(y_train, nb_classes=10)
    y_test = np_utils.to_categorical(y_test, nb_classes=10)

    X_train_ = np.expand_dims(X_train, axis=1)
    X_test_ = np.expand_dims(X_test, axis=1)
    y_train_ = np.expand_dims(y_train, axis=1)
    y_test_ = np.expand_dims(y_test, axis=1)

    sim.train({input_node: X_train_},
              {output_p: y_train_},
              #tf.train.MomentumOptimizer(5e-2, 0.9),
              tf.train.GradientDescentOptimizer(learning_rate=0.05),
              n_epochs=1,
              objective=cross_entropy
              )

    print "cross_entropy as loss is:", sim.loss({input_node: X_test_}, {output_p: y_test_}, cross_entropy)
    sim.save_params("/home/huangbo/SpikingDeepRLControl/huangbo_ws/networks/saved_weights/snn_weights")

with nengo_dl.Simulator(model, minibatch_size=10000, step_blocks=1, device="/gpu:0", seed=1) as sim:
    sim.load_params("/home/huangbo/SpikingDeepRLControl/huangbo_ws/networks/saved_weights/snn_weights")

    sim.step(input_feeds={input_node: X_test_})
    output = sim.data[output_p]
    prediction = np.squeeze(output, axis=1)

    # evaluate the model
    from sklearn.metrics import accuracy_score
    acc = accuracy_score(np.argmax(y_test, axis=1), np.argmax(prediction, axis=1))
    print "the test acc is:", acc

#13

You aren’t doing anything wrong, you’re just running into the fact that training deep networks is complicated. There are a lot of hyperparameters to play around with (number of layers, size of each layer, convolutional/fully connected layers, pooling, neuron parameters, learning rates, weight initialization, etc.). You need to have all those parts working well together, or the training won’t succeed (and it’ll settle on some low-energy solution like all-zero output, as you’re seeing).
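
The reported cross entropy of about 2.3026 is consistent with that picture: it is ln(10), which is what a softmax cross entropy gives for a 10-class problem when all outputs are equal (for example all zero). A small plain-Python sketch, with helper functions written here just for illustration:

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Softmax cross entropy for a single example with an integer class label."""
    return -math.log(softmax(logits)[label])


loss = cross_entropy([0.0] * 10, label=3)  # all-zero outputs, arbitrary true class
print(loss, math.log(10))                  # both are about 2.3026
```

So a loss stuck near 2.3026, together with roughly 10% accuracy, suggests the network is producing the same output for every class.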

My advice would be to start with an architecture that you know can succeed (there are a lot of MNIST tutorials out there that should give you a place to get started). Then make sure that when you recreate that architecture within your framework, it still works. Then start introducing new elements, like SoftLIFRate neurons. If things stop working, make small adjustments to the architecture to try to find some settings that do work.


#14

Also, I’ve been working on some new features this week that should make it easier to build different kinds of networks in nengo_dl, which includes an MNIST example. It should be finished early next week.


#15

Hello drasmuss,

Thanks for your work, I will follow your updates.
When you finish the new features, please let us know.
I will also try your suggestions.

best greetings.


#16

You can see the example here. Note that that example includes some features that weren’t in the latest official release, so you will need to do a developer installation.


#17

Hello drasmuss,

Thanks for your MNIST example. It really helps: it not only solves the MNIST problem, but also makes it clearer to me how to use nengo_dl.
And now I know why my earlier training failed. I think the problem was the way I defined the input Node.
If I just define the input node like this:
input_node = nengo.Node(size_in = self.input_shape)
it seems that the data are not fed into the network; I need to use nengo.processes.PresentInput.
After I changed it to:
input = nengo.Node(nengo.processes.PresentInput(np.zeros(shape=(1, self.input_shape)), 0.1))
the problem was solved.

Thanks again for your work.


#18

Ah yes, that’s an example of the same problem as we identified previously, where if you try to feed input into an invalid Node it just silently fails instead of giving an error message. I’ll fix that tomorrow!
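
Until that check is in place, something like the following sketch could catch the problem on the user side. It assumes that data can only be fed to Nodes with no incoming dimensions (size_in == 0), which matches what was observed above; `FakeNode` and `find_invalid_feeds` are made-up names for illustration and not part of nengo or nengo_dl:

```python
from collections import namedtuple

# Stand-in for a nengo Node; only the attributes needed for the check.
FakeNode = namedtuple("FakeNode", ["label", "size_in"])


def find_invalid_feeds(feeds):
    """Return the labels of fed objects that are not input Nodes (size_in != 0)."""
    return [obj.label for obj in feeds if getattr(obj, "size_in", None) != 0]


passthrough = FakeNode("size_in=784 node", size_in=784)  # like nengo.Node(size_in=784)
source = FakeNode("PresentInput node", size_in=0)        # like a Node driven by a process

print(find_invalid_feeds({passthrough: "data"}))  # ['size_in=784 node']
print(find_invalid_feeds({source: "data"}))       # []
```

Objects without a `size_in` attribute at all (such as an Ensemble) are reported as invalid too, since `getattr` falls back to `None`.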


#19

Hello dear drasmuss,

I don’t know whether I am using nengo_dl in the wrong way or whether there is another reason, but I found a memory leak when I use nengo_dl.
For our reinforcement learning project we collect state and action pairs, but our code dies after collecting 800 data points; we checked the memory, and it was full.
You can also see this in the MNIST example below. After we load the MNIST data, the memory should not increase that much during training. Here is the code; can you please test it:

import nengo
import nengo_dl
import numpy as np
import tensorflow as tf
from copy import deepcopy

class Deep_qNetwork_snn:
    '''
    Q_function approximation using Spiking neural network
    '''
    def __init__(self, input_shape, output_shape, save_path):
        '''
        :param input_shape: the input shape of network, an integer
        :param output_shape: the output shape of network, an integer
        :param save_path: the path to save network parameters, in the prediction, network will load the weights in 
                this path.
                example: '/home/huangbo/Desktop/weights/mnist_parameters'
        '''
        self.input_shape = input_shape
        self.output_shape = output_shape
        self.save_path = save_path

        self.softlif_neurons = nengo_dl.SoftLIFRate(tau_rc=0.02, tau_ref=0.002, sigma=0.002)
        self.ens_params = dict(max_rates=nengo.dists.Choice([100]), intercepts=nengo.dists.Choice([0]))
        self.amplitude = 0.01

    def build_network(self):
        # input_node
        input = nengo.Node(nengo.processes.PresentInput(np.zeros(shape=(1, self.input_shape)), 0.1))

        # layer_1
        x = nengo_dl.tensor_layer(input, tf.layers.dense, units=100)
        x = nengo_dl.tensor_layer(x, self.softlif_neurons, **self.ens_params)

        # layer_2
        x = nengo_dl.tensor_layer(x, tf.layers.dense, transform=self.amplitude, units=100)
        x = nengo_dl.tensor_layer(x, self.softlif_neurons, **self.ens_params)

        # output
        x = nengo_dl.tensor_layer(x, tf.layers.dense, units=self.output_shape)
        return input, x

    def choose_optimizer(self, opt, learning_rate=1):
        if opt == "adam":
            optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
        elif opt =='adadelta':
            optimizer = tf.train.AdadeltaOptimizer(learning_rate=learning_rate)
        elif opt == "rms":
            optimizer = tf.train.RMSPropOptimizer(learning_rate=learning_rate)
        elif opt == "sgd":
            optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
        return optimizer


    def objective(self, x, y):
        return tf.nn.softmax_cross_entropy_with_logits(logits=x, labels=y)

    def training(self, minibatch_size, train_whole_dataset, train_whole_labels, num_epochs):
        '''
        Train the network. The objective is the loss function; the default is 'mse', but you can also define your
        own loss function. Weights are saved after training.
        :param minibatch_size: the batch size for training. 
        :param train_whole_dataset: whole training dataset, the nengo_dl will take minibatch from this dataset
        :param train_whole_labels: whole training labels
        :param num_epochs: how many epoch to train the whole dataset
        :param pre_train_weights: if we want to fine-tuning the network, load weights before training
        :return: None
        '''

        with nengo.Network(seed=0) as model:
            nengo_dl.configure_trainable(model, default=True)
            input, output = self.build_network()
            out_p = nengo.Probe(output)

            train_inputs = {input: train_whole_dataset}
            train_targets = {out_p: train_whole_labels}

        with nengo_dl.Simulator(model, minibatch_size=minibatch_size) as sim:

            if self.save_path is not None:
                try :
                    sim.load_params(self.save_path)
                except:
                    pass

            optimizer = self.choose_optimizer('adadelta', 1)
            # construct the simulator
            sim.train(train_inputs, train_targets, optimizer, n_epochs=num_epochs, objective='mse')
            # save the parameters to file
            sim.save_params(self.save_path)

    def predict(self, prediction_input, minibatch_size=1, load_weights=False):
        '''
        prediction of the network
        :param prediction_input: a input data shape = (minibatch_size, 1, input_shape)
        :param minibatch_size: minibatch size, default = 1
        :return: prediction with shape = (minibatch_size, output_shape)
        '''

        with nengo.Network(seed=0) as model:
            nengo_dl.configure_trainable(model, default=False)
            input, output = self.build_network()
            out_p = nengo.Probe(output)

        with nengo_dl.Simulator(model, minibatch_size=minibatch_size) as sim:

            if load_weights == True:
                try:
                    sim.load_params(self.save_path)
                except:
                    pass

                input_data = {input: prediction_input}
                sim.step(input_feeds = input_data)
                output = np.squeeze(sim.data[out_p], axis=1)

            return deepcopy(output)




if __name__ == '__main__':

    import matplotlib.pyplot as plt
    from tensorflow.examples.tutorials.mnist import input_data
    from sklearn.metrics import accuracy_score

    mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

    X_test = mnist.test.images
    y_test = mnist.test.labels

    deep_qNetwork = Deep_qNetwork_snn(input_shape=784,
                                      output_shape=10,
                                      save_path='/home/huangbo/Desktop/weights/mnist_parameters'
                                      )

    for i in range(10):
        deep_qNetwork.training(minibatch_size=32,
                               train_whole_dataset = mnist.train.images[:, None, :],
                               train_whole_labels = mnist.train.labels[:, None, :],
                               num_epochs = 1
                               )

        test_input = X_test[:, None, :]
        prediction = deep_qNetwork.predict(prediction_input=test_input, minibatch_size=10000, load_weights=True)
        acc = accuracy_score(np.argmax(y_test, axis=1), np.argmax(prediction, axis=1))
        print "the test acc is:", acc

#20

There does indeed seem to be a memory leak there. However, it seems to be related to TensorFlow (it isn’t releasing all the memory when a Session is closed), so probably isn’t something we can fix easily. I’ll keep looking into it though, and raise an issue with the TensorFlow devs if needed.

In the meantime, you can avoid the problem by not closing and reopening the session every time you call predict/training. For example, just do self.my_predict_sim = nengo_dl.Simulator(...) once, and then you can use self.my_predict_sim.load_params(...) or self.my_predict_sim.step(...) within your prediction function.
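
The pattern could look roughly like this sketch. `build_sim` is a hypothetical zero-argument factory that would wrap the nengo_dl.Simulator(...) construction in your code, and `DummySim` is only a stand-in used here to count how often a simulator gets built:

```python
class PredictorWithCachedSim(object):
    """Build the simulator once and reuse it for every prediction."""

    def __init__(self, build_sim):
        self._build_sim = build_sim  # hypothetical factory, e.g. wrapping nengo_dl.Simulator(...)
        self._sim = None

    @property
    def sim(self):
        if self._sim is None:        # first use: pay the construction cost once
            self._sim = self._build_sim()
        return self._sim

    def close(self):
        if self._sim is not None:
            self._sim.close()
            self._sim = None


class DummySim(object):
    """Stand-in for a real simulator; counts constructions."""
    n_built = 0

    def __init__(self):
        DummySim.n_built += 1

    def close(self):
        pass


predictor = PredictorWithCachedSim(DummySim)
for _ in range(10):
    predictor.sim                    # would be predictor.sim.step(...) in practice
print(DummySim.n_built)              # 1
predictor.close()
```

Ten predictions then reuse a single simulator (and a single session) instead of opening and closing ten of them.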