It appears as though nengo-mpi currently supports
nengo==2.1. What work has been done in the space with respect to:
- support for newer versions of Nengo
- performance on difficult benchmarks such as Spaun, and Spaun 2.0 with its 6.6 million neurons
- comparisons to other backends (FPGA, GPU, Loihi, etc.) in terms of accuracy, speed, and power consumption
- papers that mention this backend
Relatedly, I noticed that @aditya_gilra was looking for support over 2 years ago. Was progress made there, and if so, what came of it?
Thanks @arvoelke for bringing this up! Someone I'm collaborating with is also interested in nengo-mpi for learning in very large models.
My current use case / understanding for nengo-mpi is as follows. nengo_ocl on an Nvidia 1070 Ti is 20-25x faster than nengo on, say, a 32-core server (earlier observations, recalled off the top of my head). However, nengo_ocl can't make use of multiple GPUs, and regular nengo can't make use of multiple nodes / servers. So, assuming linear scaling, if I had more than 25 32-core servers available, I would want to use nengo-mpi. Another use case would be a model too large to fit in one GPU's RAM.
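To make that break-even point concrete, here's a back-of-the-envelope sketch. The 20-25x speedup figure is the rough observation quoted above, and the `scaling_efficiency` parameter is my own illustrative assumption (real MPI scaling is sublinear, so the true break-even count would be higher):

```python
def servers_to_match_gpu(gpu_speedup, scaling_efficiency=1.0):
    """Rough number of 32-core CPU servers needed to match one GPU.

    gpu_speedup: how many times faster nengo_ocl on one GPU runs
        than nengo on a single server (20-25x per the post above).
    scaling_efficiency: fraction of ideal linear speedup that the
        MPI cluster actually achieves (1.0 = perfect scaling).
    """
    return gpu_speedup / scaling_efficiency

print(servers_to_match_gpu(25))       # 25.0 servers under perfect linear scaling
print(servers_to_match_gpu(25, 0.7))  # ~35.7 servers at 70% parallel efficiency
```

The point is just that the "more than 25 servers" threshold assumes perfect scaling; any realistic efficiency loss pushes the crossover further out.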
Do you have any other workarounds for large models? Or is nengo-mpi the only way to go? What about nengo-dl? Can it make use of multiple GPUs?
nengo-dl is, at its core, a TensorFlow simulation, so it should be possible to take advantage of multiple GPUs (since TensorFlow supports that). That being said, I have never actually tried it (it’s on the TODO list), so I would be surprised if there weren’t some gotchas in there that need to be worked around.