[Nengo-DL]: Prediction with batches of test data

Hello @Brent and @drasmuss, thank you for your inputs. The steps parameter is meant to be passed to the underlying TF predict function so that the test data generator is run for steps iterations.
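
Just to spell out what I meant there at the plain tf.keras level (some_keras_model, some_generator, and n_batches are placeholders, not names from my script):

# steps tells Model.predict how many batches to draw from the generator
# before stopping; without it, predict runs until the generator is exhausted
preds = some_keras_model.predict(some_generator, steps=n_batches)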

I can now see the GPU being used for both training and inference. Earlier I could only see the RAM usage increasing, which led me to think that only the CPU was being used. I was able to do the batch testing, but with a different syntax; mentioning it below in case others find it useful.

def get_test_batches(batch_size):
  # yield the test images in consecutive batches of size batch_size
  for i in range(0, test_images.shape[0], batch_size):
    ip = test_images[i:i+batch_size]
    yield ip

with nengo_dl.Simulator(converter.net, minibatch_size=batch_size) as nengo_sim:
  nengo_sim.load_params("./keras_to_snn_params")

  test_data = get_test_batches(batch_size)
  pred_labels = []
  for data in test_data:
    # repeat each flattened image for n_steps timesteps
    tiled_data = np.tile(data, (1, n_steps, 1))
    pred_data = nengo_sim.predict_on_batch({nengo_input: tiled_data})
    for row in pred_data[nengo_output]:
      # classify based on the output at the last timestep
      pred_labels.append(np.argmax(row[-1]))
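
With that, the accuracy can be computed from the collected labels, e.g. (assuming test_labels is an array of integer class indices in the same order as test_images):

# fraction of test images whose predicted class matches the true label
accuracy = np.mean(np.array(pred_labels) == test_labels[:len(pred_labels)])
print("Test accuracy: %.2f%%" % (100 * accuracy))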

Note that the batch_size passed to get_test_batches and the value assigned to minibatch_size should be the same, otherwise it throws an error.
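
A related caveat: if the number of test images is not an exact multiple of batch_size, the last batch comes out smaller than minibatch_size and hits the same error. A rough sketch of one way around it (the padding trick is my own workaround, not something from the tutorial):

def get_test_batches(batch_size):
  # yield fixed-size batches, padding the final partial batch by repeating
  # its last image so predict_on_batch always sees minibatch_size examples
  for i in range(0, test_images.shape[0], batch_size):
    ip = test_images[i:i+batch_size]
    if ip.shape[0] < batch_size:
      pad = np.repeat(ip[-1:], batch_size - ip.shape[0], axis=0)
      ip = np.concatenate([ip, pad], axis=0)
    yield ip

The extra predictions from the padded images just need to be dropped before computing accuracy.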

I am now stuck at regularizing the firing rates during batch training. I have implemented the following code along the same lines as the tutorial, the differences being the loss function and the use of layer names (strings) in place of Nengo node objects.

tf_keras_model, inpt, otpt, conv0, conv1 = create_2d_cnn_model((28, 28, 1))
converter = nengo_dl.Converter(tf_keras_model)

# print the node labels, which are used as the string keys in the generator below
print(converter.net.all_nodes)
with converter.net:
    output_p = converter.outputs[otpt]
    # probe the two convolutional layers so their firing rates can be regularized
    conv0_p = nengo.Probe(converter.layers[conv0])
    conv1_p = nengo.Probe(converter.layers[conv1])

batch_size, target_rate = 500, 250
def get_batches(batch_size):
  for i in range(0, train_images.shape[0], batch_size):
    ip = train_images[i:i+batch_size]
    label = train_labels[i:i+batch_size]
    # inputs: the image batch, plus n_steps and the converter's bias-node
    # signals, which have to be given explicitly when using a generator
    yield ({
        "input_1": ip,
        "n_steps": np.ones((batch_size, 1), dtype=np.int32),
        "conv2d.0.bias": np.ones((batch_size, 32, 1), dtype=np.int32),
        "conv2d_1.0.bias": np.ones((batch_size, 64, 1), dtype=np.int32),
        "dense.0.bias": np.ones((batch_size, 10, 1), dtype=np.int32)
    },
    # targets: class labels for the output probe, and the target firing
    # rate for the two convolution probes
    {
        "dense.0": label,
        "conv2d.0": np.ones((train_labels.shape[0], 1, conv0_p.size_in)) * target_rate,
        "conv2d_1.0": np.ones((train_labels.shape[0], 1, conv1_p.size_in)) * target_rate,
    })

with nengo_dl.Simulator(converter.net, minibatch_size=batch_size) as sim:
  sim.compile(
      optimizer=tf.keras.optimizers.Adam(lr=1e-3),
      loss={
          output_p: tf.keras.losses.CategoricalCrossentropy(from_logits=True),
          conv0_p: tf.losses.mse,
          conv1_p: tf.losses.mse,
      },
      # weight the firing-rate penalties much lower than the classification loss
      loss_weights={output_p: 1, conv0_p: 1e-3, conv1_p: 1e-3},
  )

  for epoch in range(10):
    # the generator is exhausted after steps_per_epoch batches,
    # so recreate it for every epoch
    data_generator = get_batches(batch_size)
    sim.fit(data_generator, epochs=1, steps_per_epoch=120)

  sim.save_params("./keras_to_snn_params_regularized")
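
Along the lines of the tutorial, the plan is then to check the achieved firing rates by running one batch through the trained network and looking at the probe outputs. A rough sketch of that check (assuming test_images is flattened to shape (N, 1, 784) as in the prediction code above):

# quick check of the firing rates after regularized training
with nengo_dl.Simulator(converter.net, minibatch_size=batch_size) as sim:
  sim.load_params("./keras_to_snn_params_regularized")
  data = sim.predict_on_batch({converter.inputs[inpt]: test_images[:batch_size]})
  for name, probe in [("conv0", conv0_p), ("conv1", conv1_p)]:
    rates = data[probe]
    print(name, "mean rate:", rates.mean(), "max rate:", rates.max())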

It first throws some warnings and then gets stuck at some step, which leads to a memory explosion on my system: I can see all 16 GB of RAM plus 30 GB of swap being used, and then it ultimately crashes. During the entire run there is no compute activity on the GPU. Even after replacing the string layer names with Nengo node objects, the story remains the same.
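
For reference, a quick way to confirm that TensorFlow can at least see the GPU (tf.config.list_physical_devices is available in TF 2.1 and later):

import tensorflow as tf

# list the physical GPUs visible to TensorFlow
print(tf.config.list_physical_devices("GPU"))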

I am not sure what’s wrong. Please let me know. I will be happy to share the entire ready-to-execute script if someone needs it. Thanks!