Sticking with Ubuntu 20.04 - kuro's notes

22.04 still worries me a little, so for now I'll build the machine on 20.04, which has a track record.

The steps from installing Ubuntu through mounting the disks are no different from before.

First, is the Nvidia RTX 3060 being recognized properly?

$ lspci | grep -i nvidia
01:00.0 VGA compatible controller: NVIDIA Corporation Device 2504 (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 228e (rev a1)

Huh? What is "Device 2504"? The local PCI ID database is probably just too old to know this card, so update it.

$ sudo update-pciids
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  278k  100  278k    0     0   109k      0  0:00:02  0:00:02 --:--:--  109k
Done.
$ lspci | grep -i nvidia
01:00.0 VGA compatible controller: NVIDIA Corporation GA106 [GeForce RTX 3060 Lite Hash Rate] (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 228e (rev a1)

Good, good.
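Even before the database is updated, the numeric vendor:device pair identifies the card. As a small sketch (the helper name extract_pci_ids is made up for this note), `lspci -nn` prints those IDs in brackets and they can be pulled out with grep:

```shell
#!/bin/sh
# Hypothetical helper: pull the [vendor:device] pairs out of `lspci -nn` output.
# The class code (e.g. [0300]) has no colon, so the pattern skips it.
extract_pci_ids() {
    grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' | tr -d '[]'
}

# Only query the hardware if lspci is actually available on this machine.
if command -v lspci >/dev/null 2>&1; then
    # 10de is NVIDIA's PCI vendor ID; -d filters by vendor (and device).
    lspci -nn -d 10de: 2>/dev/null | extract_pci_ids || true
fi
```

The device half of the pair is the "2504" that lspci showed above before it knew the name.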

Create a user named alphafold2, keep its Python environment separate with pyenv-virtualenv just in case, and install Anaconda there.

$ git clone https://github.com/pyenv/pyenv.git ~/.pyenv
$ git clone https://github.com/pyenv/pyenv-virtualenv.git ~/.pyenv/plugins/pyenv-virtualenv

$ echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bash_profile
$ echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bash_profile
$ echo 'eval "$(pyenv init -)"' >> ~/.bash_profile
$ echo 'eval "$(pyenv virtualenv-init -)"' >> ~/.bash_profile
$ source ~/.bash_profile
$ pyenv install anaconda3-5.3.1

$ pyenv virtualenv anaconda3-5.3.1 af2
$ source activate af2
$ source deactivate
$ mkdir af2
$ cd af2
$ pyenv local af2

$ pyenv versions
  system
* af2 (set by /home/alphafold2/af2/.python-version)
  anaconda3-5.3.1
  anaconda3-5.3.1/envs/af2

With this, entering the af2 directory switches Python to Anaconda3 5.3.1.
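The switch happens because `pyenv local` writes a .python-version file and pyenv looks for that file in the current directory and then in its parents. The following is only a rough sketch of that file walk (find_version_file is a made-up name, not a pyenv command, and the real lookup also honors the PYENV_VERSION variable and the global version file):

```shell
#!/bin/sh
# Sketch: walk upward from a starting directory until a .python-version
# file is found. Hypothetical helper, not part of pyenv.
find_version_file() {
    dir=$(cd "${1:-.}" && pwd) || return 1
    while :; do
        if [ -f "$dir/.python-version" ]; then
            printf '%s\n' "$dir/.python-version"
            return 0
        fi
        # Stop once the filesystem root has been checked.
        [ "$dir" = "/" ] && return 1
        dir=$(dirname "$dir")
    done
}

# Example (commented out): show which file decides the version for ~/af2
# find_version_file ~/af2
```

Any subdirectory created under af2 later picks up the same environment without running `pyenv local` again.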
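Note that on Ubuntu ~/.bash_profile normally does not exist, and once you create it and write something into it, login shells stop reading .bashrc, which is not good. Bash prefers ~/.bash_profile over ~/.profile, and on Ubuntu it is ~/.profile that pulls in .bashrc. That is inconvenient, so I add

if [ -f ~/.bashrc ]; then
    source ~/.bashrc
fi

to .bash_profile.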
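One thing to watch with `echo ... >> ~/.bash_profile` is that running it twice leaves duplicate lines. A minimal sketch of an append that is safe to repeat, assuming a made-up helper name append_once:

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if that exact line
# is not already present (-x matches whole lines, -F treats it literally).
append_once() {
    file=$1
    line=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example (commented out so that pasting this block changes nothing):
# append_once ~/.bash_profile '[ -f ~/.bashrc ] && source ~/.bashrc'
```

The same helper would work for the pyenv lines added earlier.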

It is not settled whether I will use Docker for AlphaFold2, but I'll make it available just in case.

$ sudo apt -y update
$ sudo apt -y install apt-transport-https ca-certificates curl software-properties-common
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
$ sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
$ sudo apt -y update
$ sudo apt-get install -y docker-ce
$ docker version
Client: Docker Engine - Community
 Version:           20.10.17
 API version:       1.41
 Go version:        go1.17.11
 Git commit:        100c701
 Built:             Mon Jun  6 23:02:57 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/version": dial unix /var/run/docker.sock: connect: permission denied
$ systemctl status docker
● docker.service - Docker Application Container Engine
     Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset>
     Active: active (running) since Thu 2022-06-30 12:16:45 JST; 48s ago
TriggeredBy: ● docker.socket
       Docs: https://docs.docker.com
   Main PID: 41759 (dockerd)
      Tasks: 13
     Memory: 31.8M
     CGroup: /system.slice/docker.service
             └─41759 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/con>

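Allow users other than root to run Docker too, which is what the permission denied error above is about. The new group membership only takes effect from the next login.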

$ sudo usermod -aG docker alphafold2
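usermod edits the group database, but a shell that is already running keeps its old group list. As a rough sketch (in_group is a made-up helper), `id -nG` with no user shows the groups of the current session, while `id -nG USER` reads the database:

```shell
#!/bin/sh
# Hypothetical helper: in_group GROUP [USER]
# Without USER it checks the current session; with USER it checks the
# group database, which already reflects a fresh usermod.
in_group() {
    id -nG ${2:+"$2"} 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$1"
}

if in_group docker; then
    echo "this session is in the docker group"
else
    echo "this session is not in the docker group (yet)"
fi
```

If the database check passes but the session check does not, logging out and back in is all that is missing.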

$ docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
2db29710123e: Pull complete 
Digest: sha256:13e367d31ae85359f42d637adf6da428f76d75dc9afeb3c21faea0d976f5c651
Status: Downloaded newer image for hello-world:latest

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/

Make the Nvidia GPU usable from Docker.

$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
>    && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
>    && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
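The first line of that command sources /etc/os-release and glues $ID and $VERSION_ID together to pick the right repository list. A small sketch of just that part, with a made-up function name and a subshell so the sourced variables do not leak into the calling shell:

```shell
#!/bin/sh
# Sketch: reproduce the $ID$VERSION_ID string from an os-release style file.
# distribution_id is a hypothetical name; the default path is the real file.
distribution_id() {
    ( . "${1:-/etc/os-release}" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Print the value for this machine if the file is readable.
if [ -r /etc/os-release ]; then
    distribution_id
fi
```

On this machine the value should correspond to Ubuntu 20.04, which is what ends up in the nvidia-docker.list URL.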

$ sudo apt-get update
$ sudo apt-get install -y nvidia-docker2
$ sudo systemctl restart docker
$ sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
Unable to find image 'nvidia/cuda:11.0-base' locally
11.0-base: Pulling from nvidia/cuda
54ee1f796a1e: Pull complete 
f7bfea53ad12: Pull complete 
46d371e02073: Pull complete 
b66c17bbf772: Pull complete 
3642f1a6dfb3: Pull complete 
e5ce55b8b4b9: Pull complete 
155bc0332b0a: Pull complete 
Digest: sha256:774ca3d612de15213102c2dbbba55df44dc5cf9870ca2be6c6e9c627fa63d67a
Status: Downloaded newer image for nvidia/cuda:11.0-base
Thu Jun 30 03:21:59 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.48.07    Driver Version: 515.48.07    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
|  0%   32C    P8     9W / 170W |    431MiB / 12288MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

Get AlphaFold2 with Git.

$ git clone https://github.com/deepmind/alphafold.git

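I figured I might as well try the Docker version too, so I opened
docker/run_docker.py
but the place where you used to write the locations of the databases and so on is nowhere to be found. The format or behavior seems to have changed, but I don't know the details.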

Or so I thought, but then I remembered that it has worked this way since 2.1.

The step of changing ARG CUDA=11.0 to ARG CUDA=11.1 in docker/Dockerfile is also no longer needed in this version.

$ docker build -f docker/Dockerfile -t alphafold .

Finally,

$ python -m pip install -r docker/requirements.txt

and that should be it.

python docker/run_docker.py \
  --fasta_paths='/home/alphafold2/fasta/hogehoge.fasta' \
  --max_template_date=2099-01-01 \
  --model_preset=monomer \
  --db_preset=full_dbs \
  --data_dir='/mnt/ssd/af_database' \
  --output_dir='/mnt/ssd/af_results/'

A run looks something like this.
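Since a full_dbs run takes a long time, it may be worth checking the inputs before launching. This is only a sketch of such a check; preflight is a made-up helper and not part of AlphaFold, and the paths in the example are the ones from the command above:

```shell
#!/bin/sh
# Hypothetical pre-flight check; not part of AlphaFold.
# Succeeds only if the FASTA file looks like FASTA and the data dir exists.
preflight() {
    fasta=$1
    data_dir=$2
    [ -f "$fasta" ] || { echo "missing FASTA: $fasta" >&2; return 1; }
    # The first line of a FASTA file is a header line starting with '>'.
    head -n 1 "$fasta" | grep -q '^>' || { echo "no FASTA header: $fasta" >&2; return 1; }
    [ -d "$data_dir" ] || { echo "missing data dir: $data_dir" >&2; return 1; }
}

# Example (commented out):
# preflight /home/alphafold2/fasta/hogehoge.fasta /mnt/ssd/af_database && echo ok
```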
Let me also build the Docker-free version, which has been becoming the mainstream lately.

First, install CUDA.
Basically you just pick your options on the download page at
developer.nvidia.com
and copy and paste the commands it displays.

$ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
$ sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
$ sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
$ sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /"
$ sudo apt-get update
$ sudo apt-get -y install cuda

To confirm that it was installed properly, I looked for nvcc with

$ which nvcc

but it was nowhere to be found.

$ nvcc --version

Command 'nvcc' not found, but can be installed with:

sudo apt install nvidia-cuda-toolkit

The shell suggests installing nvidia-cuda-toolkit, but this is a trap and you must never do it.
If you do, you get

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243

and you end up with CUDA 10.1 from the Ubuntu repository on top of the CUDA you just installed.

In fact, installing CUDA puts nvcc here:
/usr/local/cuda-11.7/bin/nvcc
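Rather than trusting the apt suggestion, it is easy to look for nvcc under /usr/local first. A minimal sketch, assuming the default install prefix and a made-up function name list_nvcc:

```shell
#!/bin/sh
# Hypothetical helper: list executable nvcc binaries under PREFIX/cuda*/bin.
# PREFIX defaults to /usr/local, where the NVIDIA packages put CUDA here.
list_nvcc() {
    for f in "${1:-/usr/local}"/cuda*/bin/nvcc; do
        # An unmatched glob stays literal, so test before printing.
        [ -x "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

list_nvcc
```

If this prints nothing, CUDA really is missing; if it prints a path, only PATH needs fixing.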

After installing CUDA, it seems I also needed to run

$ export PATH=/usr/local/cuda-11.7/bin${PATH:+:${PATH}}
$ export LD_LIBRARY_PATH=/usr/local/cuda-11.7/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

This is written in the detailed guide linked from the page with the install commands, if you read it thoroughly:
Installation Guide Linux :: CUDA Toolkit Documentation

$ which nvcc
/usr/local/cuda-11.7/bin/nvcc
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_May__3_18:49:52_PDT_2022
Cuda compilation tools, release 11.7, V11.7.64
Build cuda_11.7.r11.7/compiler.31294372_0
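To tell the 11.7 nvcc apart from the 10.1 one in a script, the release number can be cut out of the `nvcc --version` text. A small sketch with a made-up helper name nvcc_release:

```shell
#!/bin/sh
# Hypothetical helper: read `nvcc --version` text on stdin and print the
# number that follows "release" on the "Cuda compilation tools" line.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Only call nvcc if it is on PATH.
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version | nvcc_release
fi
```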

I'll add the PATH settings to .bash_profile.
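Because .bash_profile may be sourced more than once in a session, a guard against adding the same directory twice can help. This is a sketch with a made-up helper prepend_path, assuming CUDA is in /usr/local/cuda-11.7 as above:

```shell
#!/bin/sh
# Hypothetical helper: print LIST with DIR put in front, unless DIR is
# already one of the colon-separated entries of LIST.
prepend_path() {
    dir=$1
    list=$2
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;
        *) printf '%s\n' "$dir${list:+:$list}" ;;
    esac
}

# Lines for ~/.bash_profile (commented out here):
# export PATH=$(prepend_path /usr/local/cuda-11.7/bin "$PATH")
# export LD_LIBRARY_PATH=$(prepend_path /usr/local/cuda-11.7/lib64 "$LD_LIBRARY_PATH")
```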

To be continued.