NengoDL MNIST tutorial: trained model weights

Hello Nengo community,

I would like to extract the weights of the trained model, and I have some questions about the save_params command in the NengoDL MNIST tutorial. Does this command save the weights of the model?

For instance, the network below gives 5 NumPy arrays in the save_params file.

import numpy as np
import tensorflow as tf

import nengo
import nengo_dl

with nengo.Network(seed=0) as net:
    net.config[nengo.Ensemble].max_rates = nengo.dists.Choice([100])
    net.config[nengo.Ensemble].intercepts = nengo.dists.Choice([0])
    net.config[nengo.Connection].synapse = None
    neuron_type = nengo.LIF(amplitude=0.01)
    nengo_dl.configure_settings(stateful=False)

    # input node used to feed in the flattened 28x28 images
    inp = nengo.Node(np.zeros(28 * 28))
    x = nengo_dl.Layer(tf.keras.layers.Dense(4))(inp, shape_in=(28 * 28, 1))
    x = nengo_dl.Layer(neuron_type)(x)
    out = nengo_dl.Layer(tf.keras.layers.Dense(units=4))(x)
    out_p = nengo.Probe(out, label="out_p")
    out_p_filt = nengo.Probe(out, synapse=0.1, label="out_p_filt")

When I unzip the save_params file it gives a “headers error” warning, but it successfully extracts 5 NumPy arrays. The shapes of the arrays saved by save_params are (3136,), (1, 4), (4,), (3136, 4), and (4,) respectively. If my understanding is correct, the first array has a dimension of 3136 because the 28x28 pixels are connected to the 4 neurons of the first dense layer. Then the dense layer output is (1, 4), and (4,) is the output of the LIF. Am I right? If yes, then where does the next dimension, (3136, 4), come from?

Thank you in advance for your answer.

Hi @Choozi, and welcome back! :smiley:

To answer your questions:

The save_params function saves the parameters of the entire model. This includes the weights, biases, and state information (if specified). You can read the documentation here, or the source code here.
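In the tutorial, the typical flow looks roughly like this (a sketch; train_images / train_labels and the hyperparameters are placeholders, not anything specific to your code):

with nengo_dl.Simulator(net, minibatch_size=200) as sim:
    sim.compile(
        optimizer=tf.optimizers.RMSprop(0.001),
        loss={out_p: tf.losses.SparseCategoricalCrossentropy(from_logits=True)},
    )
    sim.fit(train_images, {out_p: train_labels}, epochs=10)
    sim.save_params("./mnist_params")  # writes the model parameters to disk

# later, the saved parameters can be restored into a fresh simulator
with nengo_dl.Simulator(net) as sim:
    sim.load_params("./mnist_params")

Note that load_params expects the same network structure that save_params was called on, so the parameters line up with the same keras_model.weights ordering.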

That is correct. If you look at the code for the save_params function, you’ll see that all it is doing is saving the keras_model.weights object to file. The keras_model.weights object itself is a Keras layer attribute, which is documented here (look for the weights attribute). Note that since NengoDL is just using the Keras weights object, NengoDL itself doesn’t have any direct control over the ordering of the parameters that appear in the keras_model.weights object (it’s up to Keras to determine this ordering).

If you look at the Keras documentation, it states:

The concatenation of the lists trainable_weights and non_trainable_weights (in this order).

Thus, the (3136, 4)-shaped set of weights are the non-trainable weights for the first Dense layer. In the case of the Dense layer, they are the kernel weights. You can also see this by printing out the keras_model.weights object:

with nengo_dl.Simulator(net) as sim:
    print(sim.keras_model.weights)

which outputs:

[<tf.Variable 'TensorGraph/base_params/trainable_float32_3136:0' shape=(3136,) dtype=float32, numpy=array([1., 1., 1., ..., 1., 1., 1.], dtype=float32)>, 
<tf.Variable 'TensorGraph/dense/kernel:0' shape=(1, 4) dtype=float32, numpy=
array([[-0.73366153,  0.8796015 ,  0.28695   , -0.14340228]],
      dtype=float32)>, 
<tf.Variable 'TensorGraph/dense/bias:0' shape=(4,) dtype=float32, numpy=array([0., 0., 0., 0.], dtype=float32)>, 
<tf.Variable 'TensorGraph/dense_1/kernel:0' shape=(3136, 4) dtype=float32, numpy=
array([[ 0.00088362, -0.0049368 , -0.00799659,  0.04305665],
       [ 0.01649414, -0.01347676, -0.00558941,  0.00883536],
       [-0.00379217,  0.02209238, -0.02727717,  0.00426263],
       ...,
       [-0.04357701, -0.00389264,  0.03157449,  0.00664571],
       [ 0.0045911 , -0.0011935 , -0.00038958, -0.02644506],
       [-0.04076268, -0.03773354,  0.04180735, -0.02416949]],
      dtype=float32)>, 
<tf.Variable 'TensorGraph/dense_1/bias:0' shape=(4,) dtype=float32, numpy=array([0., 0., 0., 0.], dtype=float32)>]
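As a side note, you can confirm that ordering (trainable first, then non-trainable) with a quick check along these lines:

with nengo_dl.Simulator(net) as sim:
    all_weights = (
        sim.keras_model.trainable_weights + sim.keras_model.non_trainable_weights
    )
    # should print True: weights is just the two lists concatenated
    print([w.name for w in all_weights] == [w.name for w in sim.keras_model.weights])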

Hello @xchoo thank you very much…!

I am glad to be back with the community.

Thank you for your detailed answer and for sharing useful links. However, I might be wrong… but I think the model has no non-trainable parameters, since I have used only dense layers. The fact that dense_1/kernel is (3136, 4) is only because of the output of the first layer. Since the input to the first layer is shaped (784, 1), only 1 value goes into that dense layer at a time, which makes the kernel/weights of the first layer (1, 4). This means the outputs of that layer will be 784 x 4 = 3136 values, which are connected to the 4 neurons of the dense_1 layer, making the kernel/weights of dense_1 (3136, 4). So these are not non-trainable parameters. This is my understanding… please correct me if I am wrong.
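A plain Keras sketch (just my own check, outside NengoDL) seems to reproduce the same shapes:

import tensorflow as tf

x = tf.keras.Input(shape=(784, 1))
dense = tf.keras.layers.Dense(4)
y = dense(x)
print(dense.kernel.shape)  # (1, 4): Dense only acts on the last axis
print(y.shape)             # (None, 784, 4): 784 * 4 = 3136 values per example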

Ah, that is correct, the dense layers do not have non-trainable weights. I misread the output of keras_model.weights. I don’t use TensorFlow / NengoDL very often. :sweat_smile:

Looking at your model again, I think the trainable weights (of shape (3136,)) come from the connection from the first dense layer to the neuron_type layer, rather than from the connection between the input and the first dense layer. If you comment out the neuron_type layer, that (3136,) set of weights goes away.

From the nengo_dl.Layer documentation, if you create a Layer with a Nengo neuron type, it will create an ensemble of neurons, where the number of neurons is the same as the input dimensionality to that layer. This also creates a connection from the preceding layer to the neural ensemble, and those weights are trainable by default.
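You can see both effects with a quick check like this (a sketch run on your network from above; the shapes are what I’d expect from the output you posted):

with nengo_dl.Simulator(net) as sim:
    # one ensemble is created for the neuron_type layer, 3136 neurons wide
    print([ens.n_neurons for ens in net.all_ensembles])
    # the (3136,) entry should disappear from this list if you comment out
    # the neuron_type layer
    print([w.shape for w in sim.keras_model.trainable_weights])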

@xchoo it’s ok … no worries :slight_smile:

Yes, you are right, this has something to do with the neuron_type. But this confuses me even more, maybe because I am looking at it from the perspective of a conventional ANN in Keras. Don’t these dense layers represent neurons themselves? I see the neuron_type layer as an activation function… just like, for example, ReLU.

Isn’t it just an activation function?

If it is a layer, then does this ‘neuron_type’ layer have its own weights as well?

If yes, then can I say that we have approximately twice as many trainable parameters as the same model built in Keras as a conventional ANN?

Looking at the documentation

A function or Keras Layer that takes the value from an input (represented as a tf.Tensor ) and maps it to some output value, or a Nengo neuron type (which will be instantiated in a Nengo Ensemble and applied to the input).

Does this mean that the weights of the neuron_type are applied to the inputs before passing them through the dense layer? Sorry if I am not making any sense… :grimacing:

@xchoo @Eric @zerone could any of you please comment on this question?

Hello @Choozi, I haven’t been following this topic, nor am I super good at it. But it seems like

the above is untrue. In a traditional TF model as well, if you don’t specify any activation directly in the Dense layer and instead follow it with a layer of ReLU (etc.) neurons, then the connection from the Dense layer to the ReLU layer is supposed to be a one-to-one identity connection (i.e. no weights). In fact, the weights are on the input connections to the Dense layer. Let’s wait for other experts to resolve your doubt.
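For example, in plain Keras (just a sketch to illustrate the analogy), a separate ReLU layer adds no weights of its own:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(4),            # the weights live here
    tf.keras.layers.Activation("relu"),  # no weights of its own
])
print([w.shape for w in model.weights])  # [(784, 4), (4,)]: kernel and bias only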

Hi @Choozi,

To get back to your original question:

Looking at the output of weights again, there is a slight correction to be made:
The first (3136,) set of weights belongs to the nengo_dl.Layer(neuron_type)(x) layer. That code creates a layer of 3136 LIF neurons, and each neuron has a trainable weight attached to it.

The (3136, 4) set of weights belongs to the second dense layer, which connects the 3136 neurons in the LIF layer to the 4 neurons of that dense layer.

To answer your further questions:

No. The nengo_dl.Layer(neuron_type)(x) layer is a bunch of activation functions plus a set of trainable weights (one for each neuron).

Yes, this is correct.

This is not correct though. Dense layers do have their own trainable weights (the kernel and bias), and they can also be given an activation function such as ReLU. In NengoDL, you can print out the trainable weights with sim.keras_model.trainable_weights:

with nengo.Network(seed=0) as net:
    inp = nengo.Node(np.zeros(28 * 28))
    x = nengo_dl.Layer(tf.keras.layers.Dense(4))(inp, shape_in=(28 * 28, 1))

with nengo_dl.Simulator(net) as sim:
    sim.keras_model.summary()
    print(sim.keras_model.trainable_weights)

@xchoo @zerone Thank you for your answers.
Considering your answers and having played a bit more with the Nengo layers, I guess the structure of the weights for the following model would be as follows:

with nengo.Network(seed=0) as net:
    # set some default parameters for the neurons that will make
    # the training progress more smoothly
    net.config[nengo.Ensemble].max_rates = nengo.dists.Choice([100])
    net.config[nengo.Ensemble].intercepts = nengo.dists.Choice([0])
    net.config[nengo.Connection].synapse = None
    neuron_type = nengo.LIF(amplitude=0.01)

    # this is an optimization to improve the training speed,
    # since we won't require stateful behaviour in this example
    nengo_dl.configure_settings(stateful=False)

    # the input node that will be used to feed in the input values
    inp = nengo.Node(np.zeros(2 * 1))
    x = nengo_dl.Layer(tf.keras.layers.Dense(1))(inp, shape_in=(2 * 1, 1))
    x = nengo_dl.Layer(neuron_type)(x)
    x = nengo_dl.Layer(tf.keras.layers.Dense(2))(x)
    x = nengo_dl.Layer(neuron_type)(x)
    x = nengo_dl.Layer(tf.keras.layers.Dense(3))(x)
    x = nengo_dl.Layer(neuron_type)(x)
    out = nengo_dl.Layer(tf.keras.layers.Dense(units=4))(x)
    out_p = nengo.Probe(out, label="out_p")
    out_p_filt = nengo.Probe(out, synapse=0.1, label="out_p_filt")

The weights of this model are:

[<tf.Variable 'TensorGraph/base_params/trainable_float32_7:0' shape=(7,) dtype=float32, numpy=array([1., 1., 1., 1., 1., 1., 1.], dtype=float32)>,
<tf.Variable 'TensorGraph/dense/kernel:0' shape=(1, 1) dtype=float32, numpy=array([[-1.1600207]], dtype=float32)>,
<tf.Variable 'TensorGraph/dense/bias:0' shape=(1,) dtype=float32, numpy=array([0.], dtype=float32)>,
<tf.Variable 'TensorGraph/dense_1/kernel:0' shape=(2, 2) dtype=float32, numpy=
array([[ 0.02475715, -0.13831842],
       [-0.2240473 ,  1.206355  ]], dtype=float32)>,
<tf.Variable 'TensorGraph/dense_1/bias:0' shape=(2,) dtype=float32, numpy=array([0., 0.], dtype=float32)>,
<tf.Variable 'TensorGraph/dense_2/kernel:0' shape=(2, 3) dtype=float32, numpy=
array([[ 0.72141063,  0.29407883,  0.0322665 ],
       [-0.23862988,  0.1772492 , -0.9892268 ]], dtype=float32)>,
<tf.Variable 'TensorGraph/dense_2/bias:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)>,
<tf.Variable 'TensorGraph/dense_3/kernel:0' shape=(3, 4) dtype=float32, numpy=
array([[-0.4879959 , -0.48369777,  0.00672376,  0.50649774],
       [ 0.20511496,  0.6891403 , -0.42060506,  0.21907902],
       [-0.55083364,  0.68524706, -0.01406157, -0.00568473]],
      dtype=float32)>,
<tf.Variable 'TensorGraph/dense_3/bias:0' shape=(4,) dtype=float32, numpy=array([ 0.05611312,  0.01365218,  0.05257304, -0.10704393], dtype=float32)>]

According to my understanding now:
The first array, with shape = (7,), holds the weights of all the neuron_type layers used in the whole model. Let’s call this array A; it starts out empty, and the neuron_type weights get appended to it layer by layer.

For the first dense layer: kernel shape = (1, 1) and bias shape = (1,), since only one neuron is used with one weight connection, as the layer takes only 1 input value at a time. Once all the inputs (input = 2) have passed through the dense layer, the output is fed to the neuron_type layer, whose weights (2 weights) are appended to A.

The next dense layer has 2 neurons, so the kernel is (2, 2) and the bias is (2,). The output of this layer will be (2, 1), which is fed to a neuron_type layer with 2 weight values that are appended to A.

The next dense layer has 3 neurons, so the kernel is (2, 3) and the bias is (3,). The output will be (3, 1), which is fed to a neuron_type layer with 3 weight values that are appended to A.

This makes the total number of weights for the neuron_type layers 7, as indicated by the first shape, (7,).
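As a rough check of that count (my own sketch; the [2, 2, 3] split is what I expect from the reasoning above):

# each nengo_dl.Layer(neuron_type) call should create one ensemble
print([ens.n_neurons for ens in net.all_ensembles])     # expecting [2, 2, 3]
print(sum(ens.n_neurons for ens in net.all_ensembles))  # expecting 7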

Does this make any sense? Am I interpreting it correctly?

If yes, then what do these weight values of the neuron_type indicate? Are they the voltage threshold of each neuron, i.e. if the membrane potential is greater than this threshold, the neuron fires?

Thank you for your reply in advance.

That’s correct. The weights for all of the neuron_type layers are accumulated into the first trainable_float32_7 array.

That is correct, yes.

No. The weights are just that: connection weights, similar to the kernel weights for the dense layers.
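If you want to pull those values out after training, one rough way (a sketch based on the variable names in the output above) is to filter on the base_params name:

with nengo_dl.Simulator(net) as sim:
    # sim.fit(...) or sim.load_params(...) would go here
    for w in sim.keras_model.weights:
        if "base_params" in w.name:  # the per-neuron connection weights
            print(w.name, w.numpy())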

@xchoo thank you for your replies. It seems I still need to understand more about the internals of NengoDL. I will look into it in more detail and will get back to you in case I have any questions.
Thank you once again for taking the time to answer all my questions. :slight_smile: