These days you really ought to be able to use Docker, so here goes.
Install Docker Engine on Ubuntu | Docker Documentation
Update the apt package index and install the required packages.
$ sudo apt-get update
$ sudo apt-get -y install curl \
    apt-transport-https \
    ca-certificates \
    gnupg-agent \
    software-properties-common
Install Docker's official GPG public key.
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
$ sudo apt-key fingerprint 0EBFCD88
Add the repository (stable).
$ sudo add-apt-repository \
    "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
    $(lsb_release -cs) \
    stable"
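To make it clearer what that command actually registers, here is a minimal sketch of the source line it builds. The helper name `docker_repo_line` is made up for illustration and is not part of Docker or apt; on Ubuntu 20.04, `lsb_release -cs` prints `focal`.

```shell
# Hypothetical helper (not part of Docker or apt): print the apt source
# line for a given Ubuntu codename, as passed to add-apt-repository.
docker_repo_line() {
    codename="$1"
    printf 'deb [arch=amd64] https://download.docker.com/linux/ubuntu %s stable\n' "$codename"
}

# On Ubuntu 20.04, `lsb_release -cs` prints "focal".
docker_repo_line focal
```

On a real machine the argument would be `"$(lsb_release -cs)"` rather than a hard-coded codename.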
Update the apt package index again and install the latest version.
$ sudo apt-get update
$ sudo apt-get -y install docker-ce docker-ce-cli containerd.io
Add the current user to the docker group, so that docker can be run without sudo.
$ sudo gpasswd -a $USER docker
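One thing that is easy to trip over here: a new group membership only applies to sessions started after the change, so the current shell usually still lacks it until you log out and back in. A small sketch to check the current session, assuming a hypothetical helper name `session_has_group`:

```shell
# Hypothetical helper: succeed if the *current session* has group $1.
# `id -nG` with no user argument reports the groups of the running process,
# which is what matters for talking to the Docker socket.
session_has_group() {
    id -nG | tr ' ' '\n' | grep -qxF "$1"
}

if session_has_group docker; then
    echo "docker group is active in this session"
else
    echo "docker group is not active yet (try logging out and back in)"
fi
```

If the second message appears right after running gpasswd, that is expected and does not mean the command failed.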
To use the GPU from Docker on Ubuntu 20.04, install the NVIDIA Container Toolkit.
GitHub - NVIDIA/nvidia-docker: Build and run Docker containers leveraging NVIDIA GPUs
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
$ sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
$ sudo systemctl restart docker
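The `distribution` variable is the least obvious part of those commands: it sources /etc/os-release and glues `$ID` and `$VERSION_ID` together, which gives `ubuntu20.04` on this machine. A rough sketch of the same logic, with the helper names `distribution_id` and `nvidia_list_url` invented for illustration:

```shell
# Hypothetical helper: build the "distribution" string ($ID$VERSION_ID)
# from an os-release style file (defaults to /etc/os-release).
distribution_id() {
    # Source the file in a subshell so its variables do not leak out.
    ( . "${1:-/etc/os-release}" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Hypothetical helper: the repository list URL for that distribution string.
nvidia_list_url() {
    printf 'https://nvidia.github.io/nvidia-docker/%s/nvidia-docker.list\n' "$1"
}

# Ubuntu 20.04 has ID=ubuntu and VERSION_ID="20.04" in /etc/os-release.
nvidia_list_url "ubuntu20.04"
```

Printing the URL before piping it into `tee` is a cheap way to notice an empty or unexpected `$distribution`.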
Now, testing whether nvidia-smi can be run inside Docker:
$ sudo docker run --gpus all --rm nvidia/cuda nvidia-smi
Unable to find image 'nvidia/cuda:latest' locally
docker: Error response from daemon: manifest for nvidia/cuda:latest not found: manifest unknown: manifest unknown.
See 'docker run --help'.
and it fails with an error. Hmm. The message says there is no manifest for nvidia/cuda:latest, the tag Docker falls back to when none is given.
Addendum
$ docker run --gpus all --rm nvidia/cuda:11.0-base nvidia-smi
Unable to find image 'nvidia/cuda:11.0-base' locally
11.0-base: Pulling from nvidia/cuda
54ee1f796a1e: Pull complete
f7bfea53ad12: Pull complete
46d371e02073: Pull complete
b66c17bbf772: Pull complete
3642f1a6dfb3: Pull complete
e5ce55b8b4b9: Pull complete
155bc0332b0a: Pull complete
Digest: sha256:774ca3d612de15213102c2dbbba55df44dc5cf9870ca2be6c6e9c627fa63d67a
Status: Downloaded newer image for nvidia/cuda:11.0-base
Sun Apr 25 12:51:06 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.119.03   Driver Version: 450.119.03   CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GT 710      Off  | 00000000:01:00.0 N/A |                  N/A |
| 50%   54C    P8    N/A /  N/A |    294MiB /   980MiB |     N/A      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
It worked.
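The only difference between the failing and the working command is the tag: with no tag, Docker looks for `latest`, and the first error says nvidia/cuda has no such manifest. A simplified sketch of that rule, using a made-up helper `image_tag` (it does not handle digest references like `name@sha256:...`):

```shell
# Hypothetical helper: print the tag Docker would look up for an image
# reference. Simplified: digest references (name@sha256:...) are not handled.
image_tag() {
    last="${1##*/}"          # drop registry/namespace, keep "name[:tag]"
    case "$last" in
        *:*) printf '%s\n' "${last##*:}" ;;   # explicit tag given
        *)   printf 'latest\n' ;;             # no tag: Docker assumes "latest"
    esac
}

image_tag nvidia/cuda            # prints: latest
image_tag nvidia/cuda:11.0-base  # prints: 11.0-base
```

So for images that do not publish `latest`, the tag always has to be written out.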
$ docker run --gpus all --rm -it nvcr.io/nvidia/tensorflow:20.09-tf1-py3
The NVIDIA driver here is 450.119.03, so I went with 20.09.
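Picking a container release by driver version comes down to a version comparison. Here is a small sketch of one way to do it in the shell. The helper `version_ge` is hypothetical, it relies on `sort -V` from GNU coreutils, and the minimum of 450 is an assumed value for illustration only; the real requirement should be taken from the release notes of the container in question.

```shell
# Hypothetical helper: succeed if version $1 >= version $2.
# Relies on `sort -V` (version sort), available in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# 450 is an assumed minimum for illustration; check the release notes.
if version_ge 450.119.03 450; then
    echo "driver is new enough"
else
    echo "driver is too old"
fi
```

The installed driver version can be read from the nvidia-smi header, or with `nvidia-smi --query-gpu=driver_version --format=csv,noheader`.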
Unable to find image 'nvcr.io/nvidia/tensorflow:20.09-tf1-py3' locally
20.09-tf1-py3: Pulling from nvidia/tensorflow
f08d8e2a3ba1: Pulling fs layer
3baa9cb2483b: Pulling fs layer
94e5ff4c0b15: Pulling fs layer
1860925334f9: Pulling fs layer
c6b364205fad: Pulling fs layer
ffcd5dc3448d: Pulling fs layer
13cf13e5ce72: Pulling fs layer
7202bec79e41: Pulling fs layer
fdfbe893941b: Pulling fs layer
f73bfa0e0e17: Pulling fs layer
36aade146566: Pulling fs layer
0cf8254e1bfe: Pulling fs layer
40ff6c34e5e5: Pulling fs layer
0adeec2cfe74: Pulling fs layer
895d871af5fd: Pulling fs layer
71c97f6ac83c: Pulling fs layer
281aa21cb812: Pulling fs layer
9c7e46bb4080: Pulling fs layer
150dfb1677cd: Pulling fs layer
3ce488e63cd3: Pulling fs layer
0f6e3807a6dc: Pulling fs layer
3585c705a7d0: Pulling fs layer
de81ad699822: Pulling fs layer
bb1a224031d9: Pulling fs layer
de7308abc9b3: Pulling fs layer
07620b1781c2: Pulling fs layer
dbc3331c85c9: Pulling fs layer
c863ab4a3ce5: Pulling fs layer
2cb780dadd08: Pulling fs layer
72698521ce7a: Pulling fs layer
1860925334f9: Waiting
c6b364205fad: Waiting
13cf13e5ce72: Waiting
c4f3861fc440: Pulling fs layer
7202bec79e41: Waiting
c8dbc3fd23eb: Pulling fs layer
fdfbe893941b: Waiting
f73bfa0e0e17: Waiting
cebab9392074: Pulling fs layer
098124793456: Pulling fs layer
be7876894a57: Pulling fs layer
8f6cac9eb6f5: Pull complete
8794953727b3: Pull complete
66611ff5ae22: Pull complete
052da93182d9: Pull complete
19ab74a7714f: Pull complete
10fb2f25565b: Pull complete
99d96c644f99: Pull complete
e04d68703197: Pull complete
54b734d972b3: Pull complete
8737a875ce8c: Pull complete
66447294ec52: Pull complete
bff8468ce910: Pull complete
102507e7b013: Pull complete
ad60ed3798eb: Pull complete
7d1ebbc9228a: Pull complete
6513260fcbe9: Pull complete
8aaacd84798e: Pull complete
7961f1c63d21: Pull complete
c079890a79ca: Pull complete
1aef1f3f370b: Pull complete
1db61ffb4058: Pull complete
30ab2ccfcdb1: Pull complete
ee27d709b773: Pull complete
Digest: sha256:e3db261638dc0283bd87d27b59be5731d2298604b44ec8bc81ab3f8e9128b6af
Status: Downloaded newer image for nvcr.io/nvidia/tensorflow:20.09-tf1-py3

================
== TensorFlow ==
================

NVIDIA Release 20.09-tf1 (build 16003718)
TensorFlow Version 1.15.3

Container image Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
Copyright 2017-2020 The TensorFlow Authors. All rights reserved.

NVIDIA Deep Learning Profiler (dlprof) Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying project or file.

ERROR: Detected NVIDIA GeForce GT 710 GPU, which is not supported by this container
ERROR: No supported GPU(s) detected to run this container

NOTE: MOFED driver for multi-node communication was not detected.
      Multi-node communication performance may be reduced.

NOTE: The SHMEM allocation limit is set to the default of 64MB. This may be
      insufficient for TensorFlow. NVIDIA recommends the use of the following flags:
      nvidia-docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 ...

root@03d78aedf70c:/workspace# python
Python 3.6.9 (default, Jul 17 2020, 12:50:27)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2021-04-25 13:14:16.276946: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
>>> print(tf.__version__)
1.15.3
>>>
Looks like it's working, maybe?
Nope, it isn't.
ERROR: Detected NVIDIA GeForce GT 710 GPU, which is not supported by this container
ERROR: No supported GPU(s) detected to run this container
it says. So the GT 710 isn't supported. Well, that's no good.
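Separately from the GPU problem, the container banner also recommends `--shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864`. A sketch of how those flags might be combined with the `--gpus all` form used in this post; the helper `tf_run_cmd` is made up, it only prints the command rather than running it, and none of this makes the GT 710 supported.

```shell
# Hypothetical helper: print (not run) a docker run command that combines
# `--gpus all` with the flags recommended in the container banner.
tf_run_cmd() {
    printf 'docker run --gpus all --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 --rm -it %s\n' "$1"
}

tf_run_cmd nvcr.io/nvidia/tensorflow:20.09-tf1-py3
```

Printing the command first makes it easy to eyeball the flags before actually pulling a multi-gigabyte image again.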