Editor Note: Pythonic employs a team of experts in engineering, AI, and machine learning. With their insights and expertise, Pythonic solutions simplify the complicated, applying cutting-edge AI to the title industry. This latest post highlights their knowledge and ongoing learning for the company and its clients.
If you've ever dived into machine learning on Ubuntu 20.04 or later, you know that setting up CUDA can be like navigating a labyrinth, particularly when dependencies such as PyTorch are built against particular CUDA versions. After several frustrating attempts and a lot of combing through forums and documentation, I finally worked out how to install the correct CUDA version for my needs. Here are the steps that led to success.
Step 1: Download the right version of CUDA
The following are compatible on Ubuntu 20.04 or later:
- wget https://developer.download.nvidia.com/compute/cuda/11.1.1/local_installers/cuda_11.1.1_455.32.00_linux.run
- wget https://developer.download.nvidia.com/compute/cuda/11.2.2/local_installers/cuda_11.2.2_460.32.03_linux.run
- wget https://developer.download.nvidia.com/compute/cuda/11.3.1/local_installers/cuda_11.3.1_465.19.01_linux.run
- wget https://developer.download.nvidia.com/compute/cuda/11.4.4/local_installers/cuda_11.4.4_470.82.01_linux.run
- wget https://developer.download.nvidia.com/compute/cuda/11.5.2/local_installers/cuda_11.5.2_495.29.05_linux.run
- wget https://developer.download.nvidia.com/compute/cuda/11.6.2/local_installers/cuda_11.6.2_510.47.03_linux.run
- wget https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda_11.7.0_515.43.04_linux.run
- wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
- wget https://developer.download.nvidia.com/compute/cuda/12.0.0/local_installers/cuda_12.0.0_525.60.13_linux.run
- wget https://developer.download.nvidia.com/compute/cuda/12.1.0/local_installers/cuda_12.1.0_530.30.02_linux.run
- wget https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda_12.2.0_535.54.03_linux.run
- wget https://developer.download.nvidia.com/compute/cuda/12.2.1/local_installers/cuda_12.2.1_535.86.10_linux.run
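All of the URLs above follow the same pattern: the toolkit version appears twice and is followed by the bundled driver version. As a small sketch, assuming that pattern holds for the versions you need, a helper can build the URL from those two values. The function name cuda_url is hypothetical and only used for illustration.

```shell
# Build a runfile URL from a toolkit version ($1) and its bundled driver
# version ($2). cuda_url is a hypothetical helper name; the URL pattern is
# copied from the list above and is not guaranteed for other releases.
cuda_url() {
    echo "https://developer.download.nvidia.com/compute/cuda/${1}/local_installers/cuda_${1}_${2}_linux.run"
}

# Print the URL for one of the versions listed above.
cuda_url 11.8.0 520.61.05

# To download it, resuming a partial file if one exists:
# wget -c "$(cuda_url 11.8.0 520.61.05)"
```

The version and driver pairs still have to come from the list above; the helper only saves retyping the long path.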
Step 2: Execute a batch command to install all downloaded versions
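The following one-liner builds one install command per downloaded runfile and pipes the commands to sh. Because only --toolkit is passed together with --silent, each run installs the toolkit alone and leaves the existing NVIDIA driver untouched: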
ls cuda*.run | awk '{print "sudo bash "$1" --no-drm --no-man-page --override --toolkit --silent"}' | sh
Each generated command looks like this example:
#sudo bash cuda_11.0.3_450.51.06_linux.run --no-drm --no-man-page --override --toolkit --silent
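If piping generated text into sh feels risky, the same batch install can be written as a plain loop. This is a sketch of an alternative, not the one-liner above: the function name install_cuda_runfiles and the DRY_RUN switch are hypothetical additions for this example, while the installer flags are the same ones used above. With DRY_RUN left at its default of 1, the loop only prints what it would run.

```shell
# Install every downloaded runfile found in a directory, toolkit only.
# install_cuda_runfiles and DRY_RUN are hypothetical names for this sketch.
# DRY_RUN defaults to 1, so nothing is installed until you set DRY_RUN=0.
install_cuda_runfiles() {
    dir="${1:-.}"
    for installer in "$dir"/cuda_*_linux.run; do
        # Skip the unexpanded pattern when no installer matches.
        [ -e "$installer" ] || continue
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "sudo bash $installer --no-drm --no-man-page --override --toolkit --silent"
        else
            sudo bash "$installer" --no-drm --no-man-page --override --toolkit --silent \
                || echo "install failed: $installer" >&2
        fi
    done
}

# Preview the commands for the runfiles in the current directory.
install_cuda_runfiles .
```

Once the preview looks right, running DRY_RUN=0 install_cuda_runfiles . performs the installs and reports any runfile that fails instead of silently moving on.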
Step 3: Configure the virtual environment to use the expected CUDA version
The final step is to make sure my virtualenv uses the right CUDA version. As an example, let’s use CUDA 11.1. I open virtualenv/bin/activate in an editor and add the following two lines:
vim virtualenv/bin/activate
export PATH=/usr/local/cuda-11.1/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.1/lib64:$LD_LIBRARY_PATH
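The same two exports can be wrapped in a small function so that switching versions is a one-word change. This is only a sketch: use_cuda and the CUDA_BASE override are hypothetical names introduced here, and it assumes the toolkits were installed under /usr/local/cuda-X.Y as in the lines above.

```shell
# Point PATH and LD_LIBRARY_PATH at one installed toolkit, e.g. use_cuda 11.1.
# use_cuda and CUDA_BASE are hypothetical names for this sketch; CUDA_BASE
# defaults to /usr/local, the prefix used in the export lines above.
use_cuda() {
    cuda_home="${CUDA_BASE:-/usr/local}/cuda-$1"
    if [ ! -d "$cuda_home/bin" ]; then
        echo "no toolkit found at $cuda_home" >&2
        return 1
    fi
    export PATH="$cuda_home/bin:$PATH"
    # The :+ expansion avoids a trailing colon when LD_LIBRARY_PATH is unset.
    export LD_LIBRARY_PATH="$cuda_home/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}

# Example call, to be placed in virtualenv/bin/activate after the definition:
# use_cuda 11.1
```

The directory check is the main benefit over the bare exports: a typo in the version number produces an error message instead of a PATH entry that silently points nowhere.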
Now when I run
source virtualenv/bin/activate
to start my virtual environment and then run “nvcc --version”, I get:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190527_0
If I deactivate the virtual environment and run “nvcc --version” again, there is no system-wide nvcc on my machine, so Ubuntu suggests installing nvidia-cuda-toolkit first:
Command 'nvcc' not found, but can be installed with:
apt install nvidia-cuda-toolkit
Please ask your administrator.
This confirms that CUDA is installed and that a specific CUDA version can be tied to a specific virtual environment.
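To check the active toolkit from a script rather than by eye, the release number can be pulled out of the nvcc output. The sketch below assumes the output contains a "release X.Y" line like the one shown above; nvcc_release is a hypothetical helper name.

```shell
# Extract the toolkit release (e.g. 11.1) from "nvcc --version" output read
# on stdin. nvcc_release is a hypothetical helper name for this sketch.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Report the active toolkit, or say so when nvcc is not on PATH.
if command -v nvcc >/dev/null 2>&1; then
    echo "active CUDA toolkit: $(nvcc --version | nvcc_release)"
else
    echo "nvcc is not on PATH"
fi
```

Run inside the activated virtualenv from this example, it should report 11.1; after deactivating, it should report that nvcc is not on PATH.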
Additional Note
Please be aware that the CUDA version shown by nvidia-smi is the highest CUDA version that the installed NVIDIA driver supports, not the version of the toolkit on your PATH. It could be the same as the CUDA version you just installed or very different, especially since we installed multiple versions of the toolkit. It took me quite a while to realize that the CUDA version reported by nvidia-smi could differ from the one nvcc reports in my terminal.
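To see the driver-side number on its own, it can be extracted from the nvidia-smi header. This sketch assumes the header contains a "CUDA Version: X.Y" field, as recent drivers print; smi_cuda_version is a hypothetical helper name.

```shell
# Extract the "CUDA Version: X.Y" field from nvidia-smi output read on stdin.
# smi_cuda_version is a hypothetical helper name for this sketch.
smi_cuda_version() {
    sed -n 's/.*CUDA Version: \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Print the version reported by the driver when nvidia-smi is available.
if command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi reports CUDA $(nvidia-smi | smi_cuda_version)"
else
    echo "nvidia-smi is not on PATH"
fi
```

Comparing this value with the release printed by nvcc --version makes the difference between the driver and the active toolkit explicit.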
Jun 10, 2024 10:52:42 AM