Hmmm, I see that you have a bit of a predicament. I'll try and elaborate on some of your options below:
NengoOCL
NengoOCL doesn't limit the number of Nengo models you can run on one GPU. However, in my experience, running multiple Nengo models on a single GPU doesn't improve your throughput at all. Because of the additional I/O needed, running 2 models on one GPU halves the speed of both simulations, resulting in a net gain of 0.
Thus, you really only have one option here, and that is to run your code on multiple GPUs. I'm not sure how to do it on Google Colab, but to get an improvement in your throughput, you'll need to spawn multiple instances of your Jupyter notebook, each on a separate GPU.
@Eric is the primary author of NengoOCL, and he may have some suggestions on how to speed up your network with NengoOCL.
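For the multi-GPU route above, here is a rough sketch of one way to pin each process to its own GPU. It relies on the `PYOPENCL_CTX` environment variable, which PyOpenCL's `create_some_context()` reads; as far as I know, NengoOCL falls back to that call when you don't pass it an explicit context. The helper names (`worker_envs`, `launch`) and the script name `run_model.py` are made up for illustration, and the platform/device indices are assumptions that depend on your machine.

```python
import os
import subprocess
import sys


def worker_envs(device_ids, platform=0):
    """Build one environment per GPU, each pinning PyOpenCL to one device."""
    envs = []
    for device in device_ids:
        env = dict(os.environ)
        # PYOPENCL_CTX answers PyOpenCL's context prompt non-interactively;
        # "0:1" means platform 0, device 1.
        env["PYOPENCL_CTX"] = f"{platform}:{device}"
        envs.append(env)
    return envs


def launch(script, device_ids, platform=0):
    """Start one copy of `script` per device and wait for all to finish."""
    procs = [
        subprocess.Popen([sys.executable, script], env=env)
        for env in worker_envs(device_ids, platform)
    ]
    return [p.wait() for p in procs]


# Example (hypothetical script name; assumes two GPUs on platform 0):
# launch("run_model.py", device_ids=[0, 1])
```

Each process then runs its own model on its own device, so the simulations don't compete for a single GPU.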
SpiNNaker
NengoSpinnaker is designed to run Nengo simulations in real time, so, doing some quick math, one simulation run with 70,000 images (at 350 ms each) will take roughly 7 hours to complete. That's slightly more manageable, but 100 runs will still take roughly a month to finish collecting all of the data. You can improve the throughput by running NengoSpinnaker simulations in parallel, but because one SpiNNaker board supports only one NengoSpinnaker model, you'll need access to multiple SpiNNaker boards to accomplish this.
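The quick math above can be written out as follows. This is just back-of-the-envelope arithmetic assuming the simulation runs at exactly real time with no setup or data-transfer overhead; the function names and the 4-board case are only for illustration.

```python
import math

N_IMAGES = 70_000      # images per run
MS_PER_IMAGE = 350     # presentation time per image, in milliseconds
N_RUNS = 100           # number of runs to collect


def hours_per_run(n_images=N_IMAGES, ms_per_image=MS_PER_IMAGE):
    """Wall-clock hours for one real-time run."""
    return n_images * ms_per_image / 1000 / 3600


def total_days(n_runs=N_RUNS, n_boards=1):
    """Wall-clock days when runs are spread over n_boards (one model per board)."""
    batches = math.ceil(n_runs / n_boards)
    return batches * hours_per_run() / 24


print(f"{hours_per_run():.1f} hours per run")           # 6.8 hours per run
print(f"{total_days():.1f} days on 1 board")            # 28.4 days on 1 board
print(f"{total_days(n_boards=4):.1f} days on 4 boards") # 7.1 days on 4 boards
```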
Your Model
I suspect the reason your model is taking a while to finish running is the online learning rule that is incorporated into the model. We do have other tools (e.g., NengoDL) that perform the learning in an offline manner, and these will train the model much more quickly. I'm not sure what your goals for your network are, but it may be worthwhile investigating NengoDL as well (see this example for an MNIST example).