Thursday, July 08, 2021

Japanese sentiment analysis

Create a virtual environment:
 python3.6 -m venv hf-venv # hf-venv: huggingface-virtualenv
Activate the virtual environment:
 source hf-venv/bin/activate
Install Hugging Face's Transformers with torch:
 python3.6 -m pip install transformers[torch]
Install other dependencies (required if you don't have them). Note that --user is dropped here, since pip rejects --user installs inside a virtual environment:
 python3.6 -m pip install -U fugashi ipadic pillow
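Before going further, it can help to confirm that the packages above are actually importable from the active environment. The following is a minimal sketch using only the standard library; the helper name missing_packages is hypothetical, and the list uses import names rather than pip names (Pillow is imported as PIL).

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names for the packages installed in the steps above.
    required = ["transformers", "torch", "fugashi", "ipadic", "PIL"]
    print("Missing:", missing_packages(required))
```

An empty list means everything was found; any name printed needs to be installed inside the virtual environment.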
Then run Japanese sentiment analysis as follows.
>>> from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
>>> model = "daigo/bert-base-japanese-sentiment"
>>> tokenizer = "daigo/bert-base-japanese-sentiment"
Run it! Output:
>>> print(pipeline("sentiment-analysis", model=model, tokenizer=tokenizer, return_all_scores=True)("家で見守ってます")) 
[[{'label': 'ポジティブ', 'score': 0.9886383414268494},
  {'label': 'ネガティブ', 'score': 0.01136170607060194}]]
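With return_all_scores=True, the pipeline returns one list of label/score dictionaries per input sentence (here ポジティブ is "positive" and ネガティブ is "negative"). If only the most likely label is needed, it can be picked out as sketched below. The helper name top_label is hypothetical, and the result literal is simply the output shown above, so the snippet runs without downloading the model.

```python
def top_label(scores):
    """Return the dictionary with the highest 'score' from one sentence's results."""
    return max(scores, key=lambda item: item["score"])


# Output copied from the pipeline call above (one inner list per input sentence).
result = [[{'label': 'ポジティブ', 'score': 0.9886383414268494},
           {'label': 'ネガティブ', 'score': 0.01136170607060194}]]

best = top_label(result[0])
print(best["label"], round(best["score"], 3))  # prints: ポジティブ 0.989
```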
List of installed packages
(hf-venv) bagus@qc2gtr3:~$ pip freeze
certifi==2021.5.30
chardet==4.0.0
click==8.0.1
dataclasses==0.8
filelock==3.0.12
fugashi==1.1.0
huggingface-hub==0.0.12
idna==2.10
importlib-metadata==4.6.1
ipadic==1.0.0
joblib==1.0.1
numpy==1.19.5
packaging==21.0
Pillow==8.3.1
pyparsing==2.4.7
PyYAML==5.4.1
regex==2021.7.6
requests==2.25.1
sacremoses==0.0.45
six==1.16.0
tokenizers==0.10.3
torch==1.9.0
tqdm==4.61.2
transformers==4.8.2
typing-extensions==3.10.0.0
urllib3==1.26.6
zipp==3.5.0

Troubleshooting when there is not enough disk space to install torch

Since torch is large (831 MB for v1.9), there was not enough free disk space in my home directory to install it. As a workaround, I created a temporary directory in a location with more space and pointed both pip's cache (--cache-dir) and its temporary directory (TMPDIR) to it when installing torch.
$ pip cache dir # works on pip==21.1.3
/home/bagustris/.cache/pip
$ mkdir -p /media/bagustris/atmaja/tmp
$ TMPDIR=/media/bagustris/atmaja/tmp/ pip install --cache-dir=/media/bagustris/atmaja/tmp/ transformers[torch]
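To decide whether a directory is a suitable target before starting the download, the free space on its filesystem can be checked first. This is a rough sketch using the standard library; the helper name has_room and the factor of two (headroom for the downloaded wheel plus the unpacked files) are assumptions, not measured requirements.

```python
import shutil


def has_room(path, needed_bytes):
    """Return True if the filesystem holding `path` has at least `needed_bytes` free."""
    return shutil.disk_usage(path).free >= needed_bytes


if __name__ == "__main__":
    wheel_size = 831 * 1024 ** 2  # torch 1.9 download size quoted above
    # Assumed headroom: twice the wheel size, for the download plus unpacking.
    print(has_room(".", 2 * wheel_size))
```

Replace "." with the directory you intend to pass as TMPDIR and --cache-dir.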