This note also describes a configuration that enables multiple CUDA versions to be installed on a single OS.
The RTX 3090 with CUDA 11 is new (at the time of writing, 2021), but TensorFlow 1.15 is old. The two are not compatible out of the box, since software development usually tracks the latest releases, so installing TensorFlow 1.15 against a new CUDA version and GPU is cumbersome. Why still use TF 1.15? Some of my research, particularly multi-output scenarios, does not work well in TF2, while it runs smoothly in TF1. Here is how I installed TF 1.15 with GPU support on CUDA 11 with an RTX 3090.
Before going into the installation process, here is the result I get at the end; it also shows my hardware.
In [1]: import tensorflow as tf
In [2]: tf.__version__
Out[2]: '1.15.4'
In [3]: tf.test.is_gpu_available()
...
2021-10-05 11:47:26.465074: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1351] Created TensorFlow device (/device:GPU:0 with 22362 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:c1:00.0, compute capability: 8.6)
2021-10-05 11:47:26.469313: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x558ebc9d6310 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-10-05 11:47:26.469331: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): NVIDIA GeForce RTX 3090, Compute Capability 8.6
Out[3]: True
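As an aside, the device-creation log line above already contains the information worth double-checking (GPU name, usable memory, compute capability). Here is a small Python sketch that pulls those fields out of such a log line; the helper `parse_device_log` and its regex are my own illustration, not a TensorFlow API:

```python
import re

# Matches TF log lines of the form:
#   Created TensorFlow device (... with NNNN MB memory) -> physical GPU (device: 0, name: ..., ...)
LOG_PATTERN = re.compile(
    r"Created TensorFlow device .* with (?P<mem_mb>\d+) MB memory\) -> "
    r"physical GPU \(device: (?P<index>\d+), name: (?P<name>[^,]+),"
)

def parse_device_log(line):
    """Return (device_index, gpu_name, memory_mb), or None if the line does not match."""
    m = LOG_PATTERN.search(line)
    if m is None:
        return None
    return int(m.group("index")), m.group("name"), int(m.group("mem_mb"))
```

This is handy when grepping long training logs to confirm that TF really picked up the RTX 3090 with the expected amount of memory.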
In this setup, I used Python 3.7; you can change it accordingly.
# install appropriate conda, mine is Python 3.7, Linux
# see: https://docs.conda.io/en/latest/miniconda.html
wget https://repo.anaconda.com/miniconda/Miniconda3-py37_4.10.3-Linux-x86_64.sh
chmod +x Miniconda3-py37_4.10.3-Linux-x86_64.sh
./Miniconda3-py37_4.10.3-Linux-x86_64.sh
# create and activate a dedicated environment
conda create --name TF1.15 python=3.7
conda activate TF1.15
# nvidia-pyindex registers NVIDIA's pip index, which hosts nvidia-tensorflow,
# NVIDIA's TF 1.15 build with support for newer GPUs such as Ampere
pip install nvidia-pyindex
pip install nvidia-tensorflow[horovod]
pip install pytz
# horovod needs an MPI implementation
conda install -c conda-forge openmpi
# make the environment's libraries visible to the dynamic linker
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/miniconda3/envs/TF1.15/lib/
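Note that the `export` above only lasts for the current shell session, and appending means a system-wide CUDA earlier in the path would still win. What the line does can be sketched in Python; the helper `prepend_ld_library_path` is my own name for illustration, and prepending (rather than appending) plus de-duplication is a common, more robust variant:

```python
import os

def prepend_ld_library_path(new_dir, env=None):
    """Put new_dir at the front of LD_LIBRARY_PATH, without duplicating entries."""
    env = os.environ if env is None else env
    parts = [p for p in env.get("LD_LIBRARY_PATH", "").split(":") if p]
    if new_dir in parts:
        parts.remove(new_dir)  # avoid the path growing on repeated activations
    env["LD_LIBRARY_PATH"] = ":".join([new_dir] + parts)
```

If you want this to happen automatically, conda environments can run such logic from activation scripts instead of your shell profile.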
An additional installation is also needed on the OS side:
sudo apt install openmpi-bin
Finally, if you have already installed CUDA version 10 or 11 system-wide on your Ubuntu OS, this setup will not interfere with it, since everything here lives inside a (conda) virtual environment. In other words, we can have multiple CUDA versions on a single OS.
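One quick way to see the multiple-CUDA situation for yourself is to look for the CUDA runtime library, `libcudart.so*`, in each directory on a library search path. The helper below is my own sketch (not part of any library):

```python
import glob
import os

def find_cudart_libs(ld_library_path):
    """Return all libcudart shared libraries found in the given colon-separated path."""
    hits = []
    for d in ld_library_path.split(":"):
        if d:
            hits.extend(sorted(glob.glob(os.path.join(d, "libcudart.so*"))))
    return hits
```

Running it over `os.environ["LD_LIBRARY_PATH"]` inside the activated environment should show the conda copy alongside any system-wide one, which is exactly why the two installations can coexist.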
In the end, using a supercomputer like abci.ai makes life easier: no additional setup is needed, just `module load`. But we need to pay for it, even though we are employed by the same organization that operates the supercomputer.
Update 2022/03/28:
Today I could not install nvidia-tensorflow[horovod] anymore. Instead, the following conda environment works.
conda create --name tensorflow-15 \
tensorflow-gpu=1.15 \
cudatoolkit=10.0 \
cudnn=7.6 \
python=3.6 \
pip=20.0
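The pins in that command go together: this TF 1.15 build expects exactly CUDA 10.0 and cuDNN 7.6, so loosening one of them tends to break the environment. As a minimal illustration, here is a sketch of that constraint; the `PINS` table and `pins_match` helper are my own, covering only the combination used above:

```python
# Illustrative pin table (my own summary, not an official API):
# each TF 1.x build is linked against one specific CUDA/cuDNN pair.
PINS = {
    "1.15": {"cudatoolkit": "10.0", "cudnn": "7.6"},
}

def pins_match(tf_version, cudatoolkit, cudnn):
    """True when the CUDA/cuDNN pair is the one this TF build was linked against."""
    expected = PINS.get(tf_version)
    return expected == {"cudatoolkit": cudatoolkit, "cudnn": cudnn}
```

This is the same reason the nvidia-tensorflow wheel used earlier exists at all: it is a separate build of TF 1.15 linked against a newer CUDA.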
Update 2022/03/29:
The steps above failed again, this time due to a problem with the libcublas library. Although no error is shown on the call to `tf.test.is_gpu_available()`, I faced an error when actually running code with TF 1.15. As a solution, I used Docker, as explained here (the page is in Indonesian; right-click >> Translate to English if you use Chrome as your browser):