Error Running - Optimizing a Spiking Neural Network Example

toddswri · April 20, 2021, 12:40am

Hello, I’m trying to build and run a Jupyter notebook for optimizing a spiking neural network using the NengoDL example at: Optimizing a spiking neural network — NengoDL 3.4.1.dev0 docs. I’m getting an error telling me that no TensorFlow GPU support is detected, but I have the tensorflow-gpu package installed in my Python 3.7 environment within Anaconda. Here’s a screenshot:

Could you please give me some direction as to what may be going wrong?
Thank you in advance,
-Todd

toddswri · April 20, 2021, 12:51am

Here is a screenshot of my installed tensorflow packages.

xchoo · April 20, 2021, 1:58pm

Hi @toddswri

The tensorflow-gpu package might not be installing the correct components to get TensorFlow to work with your GPU. The instructions on how to fix this vary with the GPU you are using.
Before going further, what GPU are you using in your computer?

Nvidia Not-Ampere GPU
If your GPU is an Nvidia GPU that is not based on the Ampere architecture (Ampere covers the 30 series cards), then you can fix your GPU issue by running:

conda install "cudatoolkit==10.1.243"
conda install cudnn

in your Conda environment.

Nvidia Ampere GPU
If your GPU is an Nvidia GPU that is based on the Ampere architecture, then you can fix your GPU issues by doing:

conda install cudatoolkit
conda install cudnn -c conda-forge

in your Conda environment.

Not Nvidia GPU
If your GPU is not an Nvidia GPU, then unfortunately, TensorFlow does not support your GPU.
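
As a rough summary of the three cases above, here is a minimal Python sketch that returns the commands listed in this post for a given GPU. The function name `conda_gpu_commands` and its arguments are hypothetical (made up for illustration), and deciding whether a particular card is Ampere-based is left to you.

```python
def conda_gpu_commands(vendor, ampere=False):
    """Return the conda commands listed in this post for the given GPU type.

    `vendor` and `ampere` are illustrative inputs; nothing here detects
    the hardware automatically.
    """
    if vendor.strip().lower() != "nvidia":
        # No commands are offered in this post for non-Nvidia GPUs.
        return []
    if ampere:
        return [
            "conda install cudatoolkit",
            "conda install cudnn -c conda-forge",
        ]
    return [
        'conda install "cudatoolkit==10.1.243"',
        "conda install cudnn",
    ]


for command in conda_gpu_commands("Nvidia", ampere=False):
    print(command)
```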

toddswri · April 20, 2021, 9:11pm

Thanks Xchoo,

I was using an older HP EliteBook 850 G1 with an HD 4400 (Mobile 1.0/1.1 GHz) GPU, so it looks like that is the culprit. I’ll give it a go on my desktop system, which is a bit more powerful, but I’ll still need to check that GPU as well.

Do you folks have any minimum system hardware requirements posted anywhere for review? Or if not, can you please provide a general recommendation for the purchase of a new laptop system?

Thanks again,
-Todd

xchoo · April 21, 2021, 1:23am

If you are looking at laptop GPUs, any Nvidia 10 (e.g. GTX 1080), 20 (e.g., RTX 2080), or 30 (e.g. RTX 3080) series GPU should work. For GPUs, the more important specification is the amount of VRAM they come equipped with. The amount of VRAM they have will limit the size of the model you can train using your GPU. I do not have a rough mapping of the number of neurons to VRAM usage to provide you, however, since it is very much dependent on the network architecture.
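
If you already have an Nvidia card and want to check how much VRAM it has, something like the following sketch should work, assuming the Nvidia driver (and therefore the nvidia-smi tool) is installed. The helper names `parse_gpu_memory` and `query_gpu_memory` are my own, not part of any library.

```python
import shutil
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'name, memory in MiB' lines into (name, MiB) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        name, _, memory = line.rpartition(",")
        try:
            gpus.append((name.strip(), int(memory.strip())))
        except ValueError:
            # Skip lines that do not end in an integer (e.g. "[N/A]").
            continue
    return gpus


def query_gpu_memory():
    """Return [(name, MiB)] reported by nvidia-smi, or [] if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=name,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_memory(result.stdout)


for name, mib in query_gpu_memory():
    print(f"{name}: {mib} MiB of VRAM")
```

An empty result only means that nvidia-smi was not found or failed; it says nothing about whether the model itself will fit in memory.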

One caution about the newest generation of GPUs (the 30 series): they have not been tested as rigorously as the older models (since they were released relatively recently), so installing software to get them to work might involve additional steps.

toddswri · April 21, 2021, 2:20am

Thank you. Your suggestions are greatly appreciated. I am predicting a new laptop on my horizon.

toddswri · April 21, 2021, 7:32pm

Well, I thought I almost had everything figured out to get this example running, but I have hit a circular error that I can’t seem to escape. I’m now trying to run the Jupyter notebook example on my work desktop with an NVIDIA Quadro RTX 4000 GPU installed.

I’m trying to install nengo-ocl in order to run the SNN example mentioned at the top of this thread. When I execute “pip install nengo-ocl” I get an error telling me that “nengo-gui 0.4.7 requires nengo<=3.0.0,>=2.6.0, but you have nengo 3.1.0 which is incompatible.” If I then try to downgrade to nengo 3.0.0, I end up getting an error that says “nengo-ocl 2.1.0 requires nengo>=3.1.0, but you have nengo 3.0.0 which is incompatible.”

I’ve included a screenshot of what’s happening:

Now I’m not sure what to do; any suggestions?
-Todd

toddswri · April 21, 2021, 7:38pm

Here’s the notebook I’m attempting to run:

xchoo · April 21, 2021, 10:38pm

Right! I see your issue. The primary cause of this issue is that NengoGUI hasn’t yet been updated to work with Nengo 3.1.0+. And since we are actively developing a complete overhaul of the NengoGUI, we have suspended any active development in the current NengoGUI.

Fixing your dependency issue, however, isn’t too difficult. In order to install NengoOCL, simply install NengoGUI (which will install the latest compatible version of Nengo for you), then install NengoOCL v2.0.0:

pip install nengo-gui 
pip install "nengo-ocl==2.0.0"
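
To see why the two pip errors quoted earlier form a loop, here is a small sketch that checks candidate Nengo versions against both requirements. The version bounds are copied from the error messages; the helper functions are hypothetical and only handle plain numeric versions like "3.1.0" (not the full pip version syntax).

```python
def parse_version(text):
    """Turn '3.1.0' into (3, 1, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def allowed(version, low=None, high=None):
    """Check low <= version <= high, ignoring a bound that is None."""
    parsed = parse_version(version)
    if low is not None and parsed < parse_version(low):
        return False
    if high is not None and parsed > parse_version(high):
        return False
    return True


# Bounds on nengo, as quoted in the pip error messages in this thread.
NENGO_GUI_0_4_7 = {"low": "2.6.0", "high": "3.0.0"}
NENGO_OCL_2_1_0 = {"low": "3.1.0", "high": None}

for candidate in ("3.0.0", "3.1.0"):
    print(
        f"nengo {candidate}:",
        "nengo-gui 0.4.7 ok =", allowed(candidate, **NENGO_GUI_0_4_7),
        "| nengo-ocl 2.1.0 ok =", allowed(candidate, **NENGO_OCL_2_1_0),
    )
```

Neither candidate satisfies both packages, which is why the fix is to pin NengoOCL to an older release rather than to change the Nengo version.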

toddswri · April 27, 2021, 10:25pm

Cool! Thank you very much.

toddswri · April 29, 2021, 6:28pm

Sorry to keep adding to this thread, but I seem to be stuck once again trying to run the Optimizing a Spiking Neural Network example.

I’ve switched from my laptop to my work desktop, which has an NVIDIA Quadro RTX 4000 installed. When I run the Optimizing a Spiking Neural Network example again, it throws the same “No GPU support detected.” error I was getting on my less-than-robust laptop (see screenshot).

I thought I might have solved the issue by following the instructions provided earlier in this thread, but I seem to be back to the same problem. From what I can gather, my NVIDIA GPU is based on the Turing GPU architecture. Could that be my problem?

I am out of ideas and stuck, so I thought it was time once again to ask for help.

Thanks.

toddswri · April 29, 2021, 8:03pm

Looks like I’m getting the same GPU error when I attempt to run the Legendre Memory Units in NengoDL example.

Could this possibly be caused by the NVIDIA Quadro RTX 4000 GPU, similar to what is happening to me in the Optimizing a Spiking Neural Network example above?

zerone · April 30, 2021, 12:40am

Hello @toddswri, not sure how much help I will be, but you should post the output of nvidia-smi from your terminal. If the GPU has been properly configured for use by TF (or Nengo-DL), the output should be similar to here.

If not, then you need to properly configure it first.

xchoo · April 30, 2021, 2:11am

The error you posted is a TensorFlow error telling you that it is unable to detect an appropriate GPU to use. To figure out what is causing the issue:

Check that the Nvidia drivers are working. As @zerone mentioned, you can do so by running nvidia-smi in a terminal window.
Check that the GPU is visible to TensorFlow. To do this, start a Python instance (in the environment you want to run your NengoDL model in), then run the following code:

import tensorflow as tf
print(tf.config.list_physical_devices('GPU'))

TensorFlow will print out a bunch of statements (which you should include in your reply) indicating the status of CUDA libraries that are being loaded. If TensorFlow can detect your GPU, you should see a final output like:

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
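
If it helps, the same check can be wrapped up so that the result is easy to paste into a reply. This is only a sketch: the function name `tensorflow_gpu_report` is made up, and it assumes a TensorFlow 2.x install, where tf.test.is_built_with_cuda and tf.config.list_physical_devices are available.

```python
def tensorflow_gpu_report():
    """Summarize whether TensorFlow is installed and which GPUs it sees."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return {"installed": False, "version": None,
                "built_with_cuda": None, "gpus": []}
    return {
        "installed": True,
        "version": tf.__version__,
        # False here means a CPU-only build of TensorFlow was installed.
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }


for key, value in tensorflow_gpu_report().items():
    print(f"{key}: {value}")
```

An empty "gpus" list together with "built_with_cuda: True" would suggest that the CUDA libraries or the driver are the problem rather than the TensorFlow package itself.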

I checked the Nvidia support page, and it seems like the Quadro RTX 4000 does meet the CUDA requirements for the latest CUDA toolkit release, so it should be usable without issue.

Some additional information that would be helpful to have:

What operating system are you using?
What packages are installed in your environment? If you are using Conda, run the command conda list in your activated environment. Otherwise, run pip freeze.

My last thought is that your laptop might be using Nvidia Optimus. Nvidia Optimus allows the OS to switch between an integrated GPU (i.e., integrated with the CPU package) and the dedicated Quadro chip. If this is the case, Python or TensorFlow might not be picking up your GPU because Nvidia Optimus has selected the integrated GPU to run with your Python environment. This should be apparent when you try to run nvidia-smi.

toddswri · April 30, 2021, 5:32pm

Thank you for the suggestions and help.

Here’s a screenshot of my NVIDIA drivers using nvidia-smi:

And here is my output when I check to see if my GPU is visible:
import tensorflow as tf
print(tf.config.list_physical_devices('GPU'))

It appears TensorFlow is not seeing the GPU.

The desktop system I’m using is running Windows 10. Here is a file listing my installed packages in my Python 3.7 environment.
Python_Installed_Packages.pdf (236.4 KB)

I’m embarrassed to say that I have no experience regarding how to configure the NVIDIA card and will need to ask for further instructions.

Thanks again.

zerone · April 30, 2021, 7:58pm

I see from your Python_Installed_Packages.pdf that you have tensorflow-gpu 2.4.1. My suggestion is to first get TF to detect your GPU. Once it does, Nengo-DL shouldn’t have any issues using it. I don’t know whether TF 2.4.1 is supported by your current CUDA version 11.0 (as seen in the screenshot you attached). So if you are not bound to TF 2.4.1, then maybe you can install TF 2.4.0 and check. Here’s one link to help you set up the GPU for TF 2.4.0.

You can find more info about Nengo-DL and TF installation here. My environment’s info is mentioned below.

>>> import tensorflow as tf
>>> tf.__version__
'2.2.0'
>>> print(tf.config.list_physical_devices('GPU'))
2021-04-30 15:51:06.032334: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-04-30 15:51:06.075846: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:04:00.0 name: Tesla P100-PCIE-12GB computeCapability: 6.0
coreClock: 1.3285GHz coreCount: 56 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 511.41GiB/s
2021-04-30 15:51:06.085741: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-04-30 15:51:06.189458: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-04-30 15:51:06.266299: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-04-30 15:51:06.373993: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-04-30 15:51:06.440953: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-04-30 15:51:06.491673: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-04-30 15:51:06.601603: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-04-30 15:51:06.603795: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
>>> import nengo_dl
>>> nengo_dl.__version__
'3.4.0'

You may want to use conda to create separate environments and install the desired versions of TF, Nengo-DL, and other libraries. Honestly, setting up TF and GPU is just a Google Search away.

EDIT: Found another related issue on this forum. Might be of some help. You may find more resolved issues similar to this one on this forum. You can use the search bar in the top right corner (beside the burger menu) to search for similar issues.

xchoo · April 30, 2021, 8:23pm

Hi @toddswri,

I’m not entirely sure what’s going on, but I think there may be a mismatch between the TensorFlow version and CUDA toolkit version you have installed in your environment. Fixing an existing Conda environment is hit or miss, so I recommend creating a new environment from scratch. Here are the instructions to get just TensorFlow and GPU support installed in your environment (you can install everything else once you test that the GPU is detected):

Create a new Conda environment
Create a new Conda environment (replace <env_name> with a name you like). Since you were using Python 3.7, I’ve replicated it here:

conda create -n <env_name> python=3.7
conda activate <env_name>

Install CUDA toolkit

conda install cudatoolkit

You should see it install v11.0. If it doesn’t, let me know!

The following NEW packages will be INSTALLED:

  cudatoolkit        pkgs/main/win-64::cudatoolkit-11.0.221-h74a9793_0

Install cuDNN

conda install -c conda-forge cudnn

You should see it install v8.1 (and some other stuff).

The following NEW packages will be INSTALLED:

  cudnn              conda-forge/win-64::cudnn-8.1.0.77-h3e0f4f4_0

Install TensorFlow
Note the use of pip instead of conda here!

pip install tensorflow

You should see it install v2.4.1, and a whole bunch of other packages:

Collecting tensorflow
  Using cached tensorflow-2.4.1-cp37-cp37m-win_amd64.whl (370.7 MB)

Test TensorFlow with GPU
Now comes time to test that TensorFlow can pick up your GPU. Start Python from your Conda terminal:

python

Then do the list_physical_devices test:

import tensorflow as tf
print(tf.config.list_physical_devices("GPU"))

When I did these two steps in a fresh Conda environment on my Windows machine, I saw:

>>> import tensorflow as tf
2021-04-30 16:09:19.465296: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
>>> print(tf.config.list_physical_devices("GPU"))
2021-04-30 16:09:43.351770: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-04-30 16:09:43.352497: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library nvcuda.dll
2021-04-30 16:09:43.374485: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:2d:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.74GHz coreCount: 82 deviceMemorySize: 24.00GiB deviceMemoryBandwidth: 871.81GiB/s
2021-04-30 16:09:43.374629: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2021-04-30 16:09:43.402428: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-04-30 16:09:43.402503: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2021-04-30 16:09:43.421391: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2021-04-30 16:09:43.423192: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2021-04-30 16:09:43.673219: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2021-04-30 16:09:43.682491: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2021-04-30 16:09:43.683142: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2021-04-30 16:09:43.683281: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

If the test successfully passes, you can then continue to install the other packages (i.e., Nengo, NengoDL, Jupyter, etc.)

Let me know if these instructions work for you!

Note: These instructions work with the latest version of NengoDL. If you want to use NengoGUI, it doesn’t support the latest version of NengoDL, so the instructions will change slightly (see the link that @zerone posted)
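
On the version mismatch I suspected at the start of this post: the two logs posted in this thread show which CUDA runtime and cuDNN major version each TensorFlow release tried to load. Here is a small lookup sketch built only from those two logs. It is not a complete compatibility table, and the names `OBSERVED` and `observed_cuda_versions` are made up for illustration.

```python
# Library versions that each TensorFlow release loaded in the logs
# posted in this thread (libcudart.so.10.1 / libcudnn.so.7 for TF 2.2.0,
# cudart64_110.dll / cudnn64_8.dll for TF 2.4.1). Other releases are
# deliberately left out because no log for them was posted here.
OBSERVED = {
    "2.2": {"cudart": "10.1", "cudnn": "7"},
    "2.4": {"cudart": "11.0", "cudnn": "8"},
}


def observed_cuda_versions(tf_version):
    """Return the CUDA/cuDNN versions seen for this TF release, or None."""
    major_minor = ".".join(tf_version.split(".")[:2])
    return OBSERVED.get(major_minor)


print(observed_cuda_versions("2.4.1"))
print(observed_cuda_versions("2.2.0"))
```

If the cudatoolkit and cudnn versions that conda installs do not line up with what your TensorFlow version tries to load, the library loading messages will report the files it could not open.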

toddswri · May 7, 2021, 9:16pm

Sorry it’s taken me so long to reply. My mom went in the hospital this week and my schedule has been erratic to say the least.

The instructions you provided worked, and it now appears that the GPU is being seen by TensorFlow in the newly created environment. I attached a screenshot of the output returned by the print(tf.config.list_physical_devices("GPU")) command, which seems to match what you posted above.

My last question has to do with the pip error message that was thrown about nengo-dl 3.4.0 requiring nengo 3.0.0, which I have not yet installed into this new environment. What version of nengo should I install so as not to have any incompatibilities with nengo-gui 0.4.7 or nengo-ocl 2.0.0? Should it be nengo 3.0.0? This question goes back to the issue I was having back on April 21st (16 days ago) when you told me to install nengo-gui with nengo-ocl 2.0.0.

As always, thanks again for your help. Your instructions really help our team a lot. Have a great weekend!

-Todd

xchoo · May 11, 2021, 4:32am

Hi @toddswri,

Getting NengoOCL to work alongside NengoDL and NengoGUI is possible, and here are the installation instructions to do so. I’m going to start the instructions by creating a new environment because I don’t know what state your existing environment is in (so the easiest is to start with a blank slate):

Create a new Conda environment
Create a new Conda environment (replace <env_name> with a name you like). Since you were using Python 3.7, I’ve replicated it here:

conda create -n <env_name> python=3.7
conda activate <env_name>

Install the various GPU libraries

conda install cudatoolkit
conda install -c conda-forge cudnn pyopencl

Along with some other packages, the following versions for cudatoolkit, cudnn and pyopencl were installed for me:

cudatoolkit        conda-forge/win-64::cudatoolkit-11.2.2-h933977f_8
cudnn              conda-forge/win-64::cudnn-8.1.0.77-h3e0f4f4_0
pyopencl           conda-forge/win-64::pyopencl-2021.1.6-py37hb605e8c_0

Install TensorFlow
Note the use of pip instead of conda here!

pip install tensorflow

You should see it install v2.4.1, and a whole bunch of other packages:

Collecting tensorflow
  Using cached tensorflow-2.4.1-cp37-cp37m-win_amd64.whl (370.7 MB)

Install Nengo and related Nengo packages
You can now install Nengo and such. The order doesn’t matter, but note that the NengoOCL version has been fixed to 2.0.0.

pip install nengo nengo-gui nengo-dl "nengo-ocl==2.0.0"

For my environment, it installed these versions (which should all work together). You might see two listings for nengo, but the second one should be v3.0.0.

Collecting nengo-dl
  Using cached nengo_dl-3.4.0-py3-none-any.whl
Collecting nengo-gui
  Using cached nengo_gui-0.4.7-py3-none-any.whl (843 kB)
Collecting nengo-ocl==2.0.0
  Using cached nengo_ocl-2.0.0-py3-none-any.whl (77 kB)
Collecting nengo
  Using cached nengo-3.0.0-py3-none-any.whl (391 kB)
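
To confirm afterwards which versions actually ended up in the environment, a short sketch like this can be run from Python. It assumes importlib.metadata is available (Python 3.8+); on Python 3.7 it falls back to the importlib_metadata backport, which would need to be installed separately. The function name `installed_versions` is my own.

```python
try:
    from importlib import metadata
except ImportError:
    # Python 3.7: requires the separate importlib_metadata backport.
    import importlib_metadata as metadata


def installed_versions(names):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


packages = ["nengo", "nengo-gui", "nengo-dl", "nengo-ocl", "tensorflow"]
for name, version in installed_versions(packages).items():
    print(f"{name}: {version or 'not installed'}")
```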

Testing the GPU
You can now proceed to test the GPU in Python. For TensorFlow (as before) do:

import tensorflow as tf
print(tf.config.list_physical_devices("GPU"))

For PyOpenCL (which NengoOCL uses), do:

import pyopencl as cl
print(cl.get_platforms())

You should see a list of available platforms to use. Sometimes it detects your CPU as an available platform to run OCL code on, so it will show up in this list. For my machine, only one entry shows up in the list:

[<pyopencl.Platform 'NVIDIA CUDA' at 0x19b412798c0>]

Next, you can print the available devices for the platform you want to use. For my machine, since there is only one available OCL platform, I use the list index 0, but your machine might be different:

print(cl.get_platforms()[0].get_devices())

With that, I see this:

[<pyopencl.Device 'NVIDIA GeForce RTX 3090' on 'NVIDIA CUDA' at 0x19b412799b0>]

indicating I have an RTX 3090 visible to PyOpenCL to use.
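
Since the right list index can differ from machine to machine, it may be easier to print every platform and device in one go. Here is a minimal sketch of that; the function name `list_opencl_devices` is made up, and it simply returns an empty list when PyOpenCL is not installed or reports an error.

```python
def list_opencl_devices():
    """Return (platform name, device name) pairs visible to PyOpenCL."""
    try:
        import pyopencl as cl
    except ImportError:
        # PyOpenCL is not installed in this environment.
        return []
    found = []
    try:
        for platform in cl.get_platforms():
            for device in platform.get_devices():
                found.append((platform.name, device.name))
    except cl.Error:
        # Raised, for example, when no OpenCL platform can be found.
        return found
    return found


for index, (platform_name, device_name) in enumerate(list_opencl_devices()):
    print(index, platform_name, "->", device_name)
```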

toddswri · May 11, 2021, 9:06pm

Okay. I followed your last post to get NengoOCL working along with NengoDL and NengoGUI. Everything seemed to go well until I got to the part where I install TensorFlow. If I’m reading the output correctly, there seem to be a couple of libraries that did not load properly during the process:

tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
Could not load dynamic library ‘cusolver64_10.dll’
tensorflow/core/common_runtime/gpu/gpu_device.cc:1757] Cannot dlopen some GPU libraries.

Here’s my screenshot of the process:

As you can see, the output of print(tf.config.list_physical_devices("GPU")) is an empty list [].

Looks like it is recognizing my NVIDIA Quadro RTX 4000 GPU when running tf.config.list_physical_devices. But then, of course, the “import pyopencl as cl” command did not work.

I’m missing some CUDA .dll files, correct? The output from the tensorflow import command tells me to look at https://www.tensorflow.org/install/gpu and follow the guide to set up and download the required libraries. Within that guide it directs me to the CUDA® install guide for Windows.

So now my question is: should I follow that advice, or is there something else you would recommend? I will need to get elevated permission via IT if I run either of the CUDA installers.

As always, thanks for your continued help.
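
For reference, one way to see which of the CUDA libraries can be found at all is to try loading each one by name with ctypes. This is only a sketch: the list of file names is copied from the Windows log posted earlier in this thread, the helper name `loadable` is made up, and ctypes may search different folders than TensorFlow does, so a "MISSING" result is a hint rather than proof.

```python
import ctypes

# DLL names taken from the TensorFlow 2.4.1 log posted earlier in this thread.
CUDA_DLLS = [
    "cudart64_110.dll",
    "cublas64_11.dll",
    "cublasLt64_11.dll",
    "cufft64_10.dll",
    "curand64_10.dll",
    "cusolver64_10.dll",
    "cusparse64_11.dll",
    "cudnn64_8.dll",
]


def loadable(library_name):
    """Return True if the operating system can locate and load the library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


for dll in CUDA_DLLS:
    print(dll, "OK" if loadable(dll) else "MISSING")
```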