run.sh scripts, and dealing with CUDA versions and GPU sharing. If you have ever seen the error below, you know how much trouble it causes:
$ nvidia-smi
Failed to initialize NVML: Driver/library version mismatch
docker run \
--rm \
--device /dev/nvidia0:/dev/nvidia0 \
--device /dev/nvidiactl:/dev/nvidiactl \
--device /dev/nvidia-uvm:/dev/nvidia-uvm \
-p 8888:8888 \
-v `pwd`:/home/user \
gcr.io/tensorflow/tensorflow:latest-gpu
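Listing every device node by hand gets tedious once a machine has more than one GPU. As a rough sketch (the function name nvidia_device_flags is made up for illustration, and it assumes the device paths contain no spaces), the --device flags could be generated from whatever nodes exist:

```shell
# Print one "--device NODE:NODE" flag per argument that is a character device.
# Hypothetical helper: call it with a glob such as /dev/nvidia*.
nvidia_device_flags() {
    for node in "$@"; do
        # Skip unmatched globs, directories, and anything that is not a device node.
        [ -c "$node" ] || continue
        printf '%s %s:%s\n' --device "$node" "$node"
    done
}

# Possible usage (not executed here):
#   docker run --rm $(nvidia_device_flags /dev/nvidia*) nvidia/cuda nvidia-smi
```

The command substitution is left unquoted on purpose so that the shell splits the output into separate arguments.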
doc up
cd /usr/local/cuda/samples/1_Utilities/deviceQuery
make
./deviceQuery # Should print "Result = PASS"
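If you want to script that check instead of reading the output by eye, a small sketch like the following could work. The function name device_query_passed is an assumption for illustration; it only looks for the "Result = PASS" line mentioned above.

```shell
# Succeed only if the text on stdin contains deviceQuery's "Result = PASS" line.
# Hypothetical helper, not part of the CUDA samples.
device_query_passed() {
    grep -q 'Result = PASS'
}

# Possible usage (not executed here):
#   ./deviceQuery | device_query_passed && echo "CUDA is working"
```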
curl -sSL https://get.docker.com/ | sh
wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker_1.0.1-1_amd64.deb
sudo dpkg -i /tmp/nvidia-docker*.deb
docker run --rm --device /dev/nvidia0:/dev/nvidia0 \
--device /dev/nvidiactl:/dev/nvidiactl \
--device /dev/nvidia-uvm:/dev/nvidia-uvm \
nvidia/cuda nvidia-smi
nvidia-docker run --rm nvidia/cuda nvidia-smi
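docker-compose is a very useful utility that lets you store your docker run configuration in a file and manage application state more easily. Although it was designed to combine multiple docker containers, docker-compose is still very useful when you have only one.
Pick a stable release: https://github.com/docker/compose/releases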
curl -L https://github.com/docker/compose/releases/download/1.15.0/docker-compose-`uname -s`-`uname -m` > /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose
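The download URL in the curl command above is assembled from the release version plus the output of uname -s and uname -m. As a sketch of that pattern (compose_url is a hypothetical helper, and the optional OS and architecture arguments exist only to make it easy to try out), you could build the URL for whichever release you picked:

```shell
# Print the docker-compose download URL for a given release version.
# OS and architecture default to `uname -s` and `uname -m`, as in the curl command above.
compose_url() {
    version="$1"
    os="${2:-$(uname -s)}"
    arch="${3:-$(uname -m)}"
    printf 'https://github.com/docker/compose/releases/download/%s/docker-compose-%s-%s\n' \
        "$version" "$os" "$arch"
}

# Possible usage (not executed here):
#   curl -L "$(compose_url 1.15.0)" > /usr/local/bin/docker-compose
```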
pip install nvidia-docker-compose
# Your nvidia driver version here
volumes:
  nvidia_driver_375.26:
    external: true
...
    volumes:
      - nvidia_driver_375.26:/usr/local/nvidia:ro
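The volume name above embeds the driver version, so it has to change whenever the driver is upgraded. A minimal sketch of deriving it, assuming the nvidia_driver_VERSION naming shown above (both function names are made up for illustration):

```shell
# Print the volume name for a given NVIDIA driver version, e.g. 375.26.
driver_volume_name() {
    printf 'nvidia_driver_%s\n' "$1"
}

# Print the top-level volumes stanza for that driver version.
driver_volume_yaml() {
    name="$(driver_volume_name "$1")"
    printf 'volumes:\n  %s:\n    external: true\n' "$name"
}

# Possible usage (not executed here); the query asks nvidia-smi for the driver version:
#   driver_volume_yaml "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
```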
Open ~/.bashrc (sometimes called ~/.bash_profile) in your favorite editor and add these lines:
alias doc='nvidia-docker-compose'
alias docl='doc logs -f --tail=100'
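If you would rather add the aliases from a script than from an editor, something along these lines might do. It is only a sketch: add_alias is a hypothetical helper that appends an alias line to the given file unless that alias is already defined there.

```shell
# Append "alias NAME='VALUE'" to FILE unless FILE already defines that alias.
# Hypothetical helper; VALUE must not contain single quotes.
add_alias() {
    rc_file="$1"
    name="$2"
    value="$3"
    touch "$rc_file"
    if ! grep -q "^alias $name=" "$rc_file"; then
        printf "alias %s='%s'\n" "$name" "$value" >> "$rc_file"
    fi
}

# Possible usage (not executed here):
#   add_alias ~/.bashrc doc 'nvidia-docker-compose'
#   add_alias ~/.bashrc docl 'doc logs -f --tail=100'
```

Running it twice leaves a single copy of each alias, so it is safe to keep in a setup script.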
Run source ~/.bashrc to update your settings.
version: '3'
services:
  tf:
    image: gcr.io/tensorflow/tensorflow:latest-gpu
    ports:
      - 8888:8888
    volumes:
      - .:/notebooks
doc up
Here doc is the alias for nvidia-docker-compose. It generates a modified configuration file, nvidia-docker-compose.yml, with the correct volume-driver, and then runs docker-compose.
doc logs
doc stop
doc rm
# ...etc