
Tuesday, October 05, 2021

Installing TensorFlow 1.15 on an RTX 3090 with GPU support

This note also describes a configuration that allows multiple CUDA versions to coexist in a single OS.

The RTX 3090 with CUDA 11 is new (at the time of writing, 2021), but TensorFlow 1.15 is old. The two aren't compatible out of the box, since software development usually tracks the latest releases, so installing TensorFlow 1.15 with a new CUDA version and GPU is cumbersome. Why still use TF 1.15? Some of my research, particularly multi-output scenarios, doesn't work well in TF2, while it runs smoothly in TF1. Here is how I installed TF 1.15 with GPU support on CUDA 11 with an RTX 3090.

Before going into the installation process, here is the result I get at the end; it also shows my hardware.
In [1]: import tensorflow as tf
In [2]: tf.__version__
Out[2]: '1.15.4'

In [3]: tf.test.is_gpu_available()
...
2021-10-05 11:47:26.465074: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1351] Created TensorFlow device (/device:GPU:0 with 22362 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:c1:00.0, compute capability: 8.6)
2021-10-05 11:47:26.469313: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x558ebc9d6310 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-10-05 11:47:26.469331: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA GeForce RTX 3090, Compute Capability 8.6
Out[3]: True

In this setup, I used Python 3.7; you can change it accordingly.
# install appropriate conda, mine is Python 3.7, Linux
# see: https://docs.conda.io/en/latest/miniconda.html
wget https://repo.anaconda.com/miniconda/Miniconda3-py37_4.10.3-Linux-x86_64.sh
chmod +x Miniconda3-py37_4.10.3-Linux-x86_64.sh
./Miniconda3-py37_4.10.3-Linux-x86_64.sh

conda create --name TF1.15 python=3.7
conda activate TF1.15

pip install nvidia-pyindex
pip install nvidia-tensorflow[horovod]
pip install pytz

conda install -c conda-forge openmpi
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/miniconda3/envs/TF1.15/lib/
An additional installation is needed on the OS side:
 sudo apt install openmpi-bin
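
To make sure the Horovod piece of nvidia-tensorflow[horovod] can actually use this MPI setup, a quick single-process sanity check is sketched below (this is my own addition, not part of the original steps; it only confirms that the import and initialization work):
import horovod.tensorflow as hvd

hvd.init()                            # initializes Horovod; works in a single process too
print(hvd.rank(), hvd.size())         # expect: 0 1 when run without mpirun
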
Finally, if you already have CUDA 10 or 11 installed system-wide in your Ubuntu OS, this setup will not interfere with it, since everything here lives inside the conda virtual environment. In other words, we can have multiple CUDA versions in a single OS.
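
If you want to convince yourself of this isolation, a rough Linux-only sketch is to import TensorFlow inside the TF1.15 environment and inspect which CUDA libraries the process actually loaded (the conda-env path mentioned in the comment is an assumption; adjust it to your install):
import tensorflow as tf

tf.test.is_gpu_available()  # forces the CUDA libraries to be loaded

# List the loaded libcudart/libcublas mappings; they should point inside
# the conda environment (e.g. ~/miniconda3/envs/TF1.15/...), not /usr/local/cuda.
with open("/proc/self/maps") as maps:
    cuda_libs = {line.split()[-1] for line in maps
                 if "libcudart" in line or "libcublas" in line}
for lib in sorted(cuda_libs):
    print(lib)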

In the end, using a supercomputer like abci.ai makes life easier: no additional setup is needed, just `module load`. But we have to pay for it, even though we are employed by the same organization that operates the supercomputer.


Update 2022/03/28:  
I could not install nvidia-tensorflow[horovod] via conda today. Instead, the following works.
conda create --name tensorflow-15 \
    tensorflow-gpu=1.15 \
    cudatoolkit=10.0 \
    cudnn=7.6 \
    python=3.6 \
    pip=20.0

Update 2022/03/29:

The steps above now fail again due to a problem with the libcublas library. Although no error is shown when calling `tf.test.is_gpu_available()`, I faced an error when actually running code with TF 1.15. As a solution, I used Docker, as explained here (in Indonesian; right-click >> translate to English if you use Chrome as a browser):

Monday, April 08, 2019

Deep Learning Implementation with Keras: Basic Terminology

Here is deep learning with Keras in 30 seconds (6 seconds of reading per line):
model = Sequential()
model.add(Dense(XX, input_dim=Y))
model.add(Dense(Z))
model.compile(loss='loss_function_name', optimizer='optimizer_name', metrics=['metric_name'])
model.fit(training_input, training_output, epochs=AA)

Important terminology (you can't understand what you don't know?!):

Model: the type of model used. There are two: Sequential and the functional API.
Example:
model = Sequential()
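
For comparison, here is a minimal sketch of the second type, the functional API (the layer sizes are arbitrary choices for illustration):
from keras.layers import Input, Dense
from keras.models import Model

# Functional API: layers are called on tensors, and the model is built
# from explicit input and output tensors.
inputs = Input(shape=(16,))
hidden = Dense(32, activation='relu')(inputs)
outputs = Dense(1, activation='sigmoid')(hidden)
model = Model(inputs=inputs, outputs=outputs)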

Dense: adds units (nodes), together with their arguments, from one layer to the next. In other words, a "fully-connected layer", also known as a "feed-forward network" or "multi-layer perceptron".


Example:

model.add(Dense(32, input_dim=16))
The code above creates a layer with 32 units and 16 input units. The number of learnable parameters is 16*32 + 32 = 544; check it with model.summary(). If we add another layer,
model.add(Dense(32))
then we now have two layers with 32 units each. The total number of learnable parameters is 544 + (32*32 + 32) = 1,600; check it again with model.summary().
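
To verify both counts in one go, here is a minimal self-contained sketch (imports added; the sizes follow the example above):
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(32, input_dim=16))  # 16*32 + 32 = 544 parameters
model.add(Dense(32))                # 32*32 + 32 = 1,056 parameters
model.summary()                     # Total params: 1,600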

Compile: configures the model for training on the given data.
Example:
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
Fit: finds the relationship between the input and output data; in other words, it builds/trains the model.
Example:
model.fit(train_input, train_output, batch_size=16, epochs=500)
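
Putting the four steps together, here is a minimal runnable sketch with random dummy data (the shapes, layer sizes, and hyperparameters are arbitrary choices for illustration):
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Dummy binary-classification data: 100 samples, 16 features each.
train_input = np.random.random((100, 16))
train_output = np.random.randint(2, size=(100, 1))

model = Sequential()
model.add(Dense(32, input_dim=16, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(train_input, train_output, batch_size=16, epochs=5)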

The four terms above are the basic steps for training data with deep learning. However, each of them takes arguments. For completeness, here is a brief explanation of those arguments.