Tuesday, October 05, 2021

Installing TensorFlow 1.15 on an RTX 3090 with GPU support

This note also describes a configuration that enables multiple CUDA versions to be installed in a single OS.

The RTX 3090 with CUDA 11 is new (at the time of writing, 2021), but TensorFlow 1.15 is old. The two aren't compatible with each other out of the box, since software development usually follows the latest updates, so installing TensorFlow 1.15 on a new CUDA version and GPU is cumbersome. Why still use TF 1.15? Some of my research, particularly multi-output scenarios, doesn't work well in TF2 but runs smoothly in TF1. Here is how I installed TF 1.15 with GPU support on CUDA 11 with an RTX 3090.

Before going into the installation process, here is the result I get at the end; it also shows my hardware.
In [1]: import tensorflow as tf
In [2]: tf.__version__
Out[2]: '1.15.4'

In [3]: tf.test.is_gpu_available()
...
2021-10-05 11:47:26.465074: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1351] Created TensorFlow device (/device:GPU:0 with 22362 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:c1:00.0, compute capability: 8.6)
2021-10-05 11:47:26.469313: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x558ebc9d6310 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-10-05 11:47:26.469331: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA GeForce RTX 3090, Compute Capability 8.6
Out[3]: True

In this setup, I use Python 3.7. You can change it accordingly.
# install appropriate conda, mine is Python 3.7, Linux
# see: https://docs.conda.io/en/latest/miniconda.html
wget https://repo.anaconda.com/miniconda/Miniconda3-py37_4.10.3-Linux-x86_64.sh
chmod +x Miniconda3-py37_4.10.3-Linux-x86_64.sh
./Miniconda3-py37_4.10.3-Linux-x86_64.sh

conda create --name TF1.15 python=3.7
conda activate TF1.15

pip install nvidia-pyindex
pip install nvidia-tensorflow[horovod]
pip install pytz

conda install -c conda-forge openmpi
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/miniconda3/envs/TF1.15/lib/
An additional package is also needed on the OS side:
sudo apt install openmpi-bin
Finally, if you have already installed CUDA 10 or 11 system-wide on your Ubuntu OS, this setup will not interfere with it, since everything here is installed inside the (conda) virtual environment. It means we can have multiple CUDA versions in a single OS.
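To double-check this on Linux, a rough sketch like the one below (run inside the TF1.15 environment) lists which CUDA libraries the Python process actually loaded; the exact paths depend on your setup, but they should point inside the environment rather than /usr/local/cuda.

import tensorflow as tf

tf.test.is_gpu_available()   # force TensorFlow to load its CUDA libraries
# List the cudart/cublas libraries mapped into this process (Linux only)
with open("/proc/self/maps") as f:
    libs = sorted({line.split()[-1] for line in f
                   if "libcudart" in line or "libcublas" in line})
print("\n".join(libs))   # expect paths inside the TF1.15 env, not /usr/local/cuda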

In the end, using a supercomputer like abci.ai makes life easier: no additional setup is needed, just `module load`. But we still have to pay, even though we are employed by the same organization that operates the supercomputer.


Update 2022/03/28:  
Today I could not install nvidia-tensorflow[horovod] in the conda environment. Instead, the following works:
conda create --name tensorflow-15 \
    tensorflow-gpu=1.15 \
    cudatoolkit=10.0 \
    cudnn=7.6 \
    python=3.6 \
    pip=20.0

Update 2022/03/29:

These steps failed again due to a problem with the libcublas library. Although no error was shown on the call of `tf.test.is_gpu_available()`, I faced an error when running actual code with TF 1.15. As a solution, I used Docker, as explained here (in Indonesian; right-click >> Translate to English if you use Chrome as a browser).
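Note that `tf.test.is_gpu_available()` only checks device visibility, so it can pass even when the CUDA libraries are broken. A better smoke test is to run a small op on the GPU so that cuBLAS is actually exercised; a minimal sketch in TF 1.x graph mode:

import tensorflow as tf

# A tiny graph that forces a cuBLAS call (matmul) on the GPU
with tf.device("/GPU:0"):
    a = tf.random.normal([1024, 1024])
    b = tf.random.normal([1024, 1024])
    c = tf.reduce_sum(tf.matmul(a, b))

with tf.Session() as sess:
    print(sess.run(c))   # fails loudly here if libcublas cannot be loaded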

3 comments:

  1. Hey! I like your website. This post you have here really helped me out. Specifically the update from 2022/03/29.

  2. Thanks very much, this blog is very helpful. Thanks again.

  3. The new version of nvidia-tensorflow requires Python 3.8 or above. That's why you failed on the updated version.


