100% RAM usage while predicting

Hello all,

I converted a Keras model and tried some predictions. What I noticed is that, after a couple of runs, RAM usage hits 100%. Even after the prediction finishes, the usage does not go down at all. Any ideas on how to prevent this from happening?

My PC has 8 GB RAM and no GPU.

Thank you

Hi @ntouev,

Without looking at your code, or at least a code snippet, it's hard to tell exactly what is causing the memory issues (memory usage is affected by a lot of different factors). That said, I'll make some educated guesses and see if I can help you with your problem.

Since you are running a Keras model, I’ll assume you are using TensorFlow and/or NengoDL. If you are running into memory issues just using TensorFlow itself, then the way to get around it is to reduce the number of parameters in your network (i.e., reduce the number of neurons, etc.), and/or reduce the batch size of your training / prediction runs.
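
For reference, in plain Keras/TensorFlow the batch size is just an argument to fit / predict. A minimal sketch (keras_model, x_train, y_train, and x_test are placeholders for your own model and data):

# Smaller batches keep fewer activations in RAM at once, at the cost of speed.
keras_model.fit(x_train, y_train, batch_size=8, epochs=1)
predictions = keras_model.predict(x_test, batch_size=8)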

If you are using NengoDL, and you are running into memory issues during the training process, once again, the way to get around the memory issue is to reduce the number of parameters in your network and/or reduce the batch size. NengoDL uses TensorFlow (it calls out to the TensorFlow package in the background) to do the training of the model, so the memory usage in TensorFlow and NengoDL should be similar.
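
In NengoDL, the batch size is controlled by the Simulator's minibatch_size argument. A minimal sketch (net, train_x, and train_y are placeholders for your converted network and training data):

# Smaller minibatch_size trades training/prediction speed for lower memory use.
with nengo_dl.Simulator(net, minibatch_size=4) as sim:
    sim.fit(train_x, train_y, epochs=1)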

If you are using NengoDL and the memory issues arise only during the prediction stage, then it might have something to do with Nengo itself. I believe NengoDL keeps the probe data around during the prediction stage (I'd have to double check this), and keeping a history of probed data will increase the RAM usage. By default, probes record data every timestep, so you can try reducing the memory usage by setting the sample_every parameter on the probes to lower their sampling frequency (I'll need to check whether this works in NengoDL, and it would be better to have some code from you to test with as well).
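
For illustration, this is how sample_every is set on a probe in plain Nengo (whether NengoDL honours it during predict is the part I'd have to double check):

import nengo

with nengo.Network() as net:
    ens = nengo.Ensemble(n_neurons=100, dimensions=1)
    # With the default dt of 0.001 s, sample_every=0.01 stores only every
    # 10th timestep, so this probe's history is roughly 10x smaller.
    probe = nengo.Probe(ens, sample_every=0.01)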

Hello @xchoo ,

thank you for your fast and detailed response.

Below you can see my code. It is a simple conversion followed by a prediction on a single input. Using TensorFlow alone I did not face any issue. The code contains some parts and helper functions that will not make sense without the whole repo, but these should not affect the memory issue.

The initial keras_model is a segmentation CNN with (224, 224, 3) input frames and (112, 112, 2) output, i.e. a 112x112 image with 2 classes after segmentation.

import cv2
import numpy as np
import nengo_dl

# keras_model, get_image_array and get_segmentation_array are defined in my repo
nengo_model = nengo_dl.Converter(model=keras_model)
assert nengo_model.verify()

nengo_input = nengo_model.inputs[keras_model.inputs[0]]
nengo_output = nengo_model.outputs[keras_model.outputs[0]]

inp = cv2.imread('./test_frames/00005.png', 1)
ann = cv2.imread('./test_annotations/00005.png', 1)
################## predict ################## 
x = get_image_array(inp, width=224, height=224, ordering="channels_last")
x = x.reshape(1, 1, 224*224*3)
n_steps = 1
x = np.tile(x, (1, n_steps, 1))

with nengo_dl.Simulator(nengo_model.net, minibatch_size=1, progress_bar=False) as nengo_sim:
    data = nengo_sim.predict(x)

pr = data[nengo_output].reshape(n_steps,112,112,2)
pr = pr[0,:,:,:]
pr = pr.argmax(axis=2) 

################## evaluate ##################
gt = get_segmentation_array(ann, nClasses=2, width=112, height=112,
                             no_reshape=True, read_image_type=1)
gt = gt.argmax(axis=2) 

accuracy = (pr==gt).mean()
print('Accuracy =', accuracy)

Hi @ntouev,

If you are only experiencing memory issues with NengoDL, then it may be NengoDL that is using up your memory. Here are some things to try to pin down exactly which part of the code is causing the memory issues.

  1. Comment out all of the code after and including the with nengo_dl.Simulator line. Does your code still run into memory issues? If it does, then it's likely that the NengoDL predict function isn't the cause of the memory issue.
  2. If you don't get any memory issues in step 1, uncomment the with nengo_dl.Simulator context block, but nothing else. If your code now runs into memory issues, then the problem has to do with the Simulator object creation or the predict function call.
  3. To test if the simulator creation is causing the memory issue, use this code to create the nengo_dl Simulator object (you can replace the nengo_dl.Simulator context block with this):
sim = nengo_dl.Simulator(...)
  4. To test if the predict function call is the cause of the memory issue, try creating the NengoDL Simulator object, then call the native Keras predict function on the NengoDL simulator's internal Keras model:
sim = nengo_dl.Simulator(...)
sim.keras_model.predict(...)

If you can do these experiments and identify which part of the code is causing the issue, it'll give us a better idea of what's going on, and of possible remedies.
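
For reference, here is a rough sketch of how you could measure the memory use at each stage (simulator creation, then the predict call). I'm using psutil here purely as a suggestion; any system monitor works just as well. nengo_model and x are the variables from your snippet:

import psutil
import nengo_dl

def print_mem(label):
    # Resident memory of the current process, in GB
    print(label, psutil.Process().memory_info().rss / 1e9, "GB")

print_mem("after conversion:")

# Does creating the simulator alone exhaust RAM?
sim = nengo_dl.Simulator(nengo_model.net, minibatch_size=1, progress_bar=False)
print_mem("after Simulator creation:")

# Does the predict call itself exhaust RAM?
data = sim.predict(x)
print_mem("after predict:")

sim.close()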