How to Run Docker Compose Containers With GPU Access



GPU access in Docker lets you containerize demanding workloads such as machine learning applications. GPUs aren’t automatically available when you start a new container, but you can enable them with the --gpus flag for docker run or by adding extra fields to a docker-compose.yml file.

In this article, we’ll show how to enable GPU support in Docker Compose. You’ll need Docker Compose v1.28 or newer to follow along with the guide. GPUs aren’t supported in Compose versions v1.18 and older; releases between v1.19 and v1.27 use a legacy field structure that provides less control.

Preparing Your System

Your Docker host needs to be prepared before it can expose your GPU hardware. Although containers share your host’s kernel, they can’t see the system packages you’ve got installed. A plain container will lack the device drivers that interface with your GPU.

You can activate support for NVIDIA GPUs by installing the NVIDIA Container Toolkit. The commands below assume a Debian or Ubuntu host that uses apt:

distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
   && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
   && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt update
sudo apt install -y nvidia-docker2
sudo systemctl restart docker

This package wraps Docker’s container runtime with an interface to your host’s NVIDIA driver. Inspecting your /etc/docker/daemon.json file will confirm that an nvidia container runtime has been registered. The NVIDIA toolkit handles injection of GPU device connections when new containers start, then hands over to your regular container runtime.

$ cat /etc/docker/daemon.json
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
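
If you’d rather check this from a script than by eye, a short Python sketch along these lines could work. It assumes the file is valid JSON with the layout shown above; the function name has_nvidia_runtime is our own and isn’t part of Docker.

```python
import json


def has_nvidia_runtime(daemon_json_text):
    """Return True if the daemon config registers a runtime named "nvidia"."""
    config = json.loads(daemon_json_text)
    # "runtimes" is absent when no extra runtime has been registered.
    return "nvidia" in config.get("runtimes", {})


# Sample input copied from the daemon.json listing above.
sample = """
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
"""

print(has_nvidia_runtime(sample))
print(has_nvidia_runtime("{}"))
```

On a real host you’d pass in the contents of /etc/docker/daemon.json instead of the sample string.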

Preparing Your Image

GPU access in Docker also relies on your container image being correctly configured. It’s usually simplest to base your image on a variant of nvidia/cuda. This NVIDIA-provided starting point comes pre-configured with CUDA support. Install any programming languages you need, then copy in your GPU-dependent code:

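FROM nvidia/cuda:11.4.0-base-ubuntu20.04

# Install Python and the GPU-enabled TensorFlow package
RUN apt-get update && \
  apt-get install -y python3 python3-pip && \
  pip3 install tensorflow-gpu

# Copy in the GPU-dependent code and run it when the container starts
COPY tensor.py .
ENTRYPOINT ["python3", "tensor.py"]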
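
The contents of tensor.py aren’t shown in this guide. As a purely hypothetical stand-in, a minimal script might just report which GPUs TensorFlow can see. The summarize helper is our own invention; tf.config.list_physical_devices is TensorFlow’s call for listing devices.

```python
def summarize(device_names):
    """Build a one-line report from a list of GPU device names."""
    if not device_names:
        return "No GPUs visible to TensorFlow"
    return f"{len(device_names)} GPU(s) visible: {', '.join(device_names)}"


def main():
    # Imported lazily so the helper above can be used without TensorFlow.
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed")
        return
    gpus = tf.config.list_physical_devices("GPU")
    print(summarize([gpu.name for gpu in gpus]))


if __name__ == "__main__":
    main()
```

If the container reports no GPUs, that usually points to the host or Compose configuration rather than the script.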

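The CUDA version in your image shouldn’t be newer than the version your host’s driver supports. You can check this by running nvidia-smi, which reports the highest CUDA version the installed driver supports in its header: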

$ nvidia-smi
Tue May 10 19:15:00 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.103.01   Driver Version: 470.103.01   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
...
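
To compare the two versions automatically, you could parse them out of the nvidia-smi header and the image tag. This is a rough sketch: both function names are hypothetical, and it assumes the header and tag formats shown in this article.

```python
import re


def parse_cuda_version(smi_output):
    """Extract the "CUDA Version" value from nvidia-smi output as (major, minor)."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def image_cuda_version(image_tag):
    """Extract (major, minor) from a tag such as nvidia/cuda:11.4.0-base-ubuntu20.04."""
    match = re.search(r":(\d+)\.(\d+)", image_tag)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Header line copied from the nvidia-smi output above.
header = "| NVIDIA-SMI 470.103.01   Driver Version: 470.103.01   CUDA Version: 11.4     |"
host = parse_cuda_version(header)
image = image_cuda_version("nvidia/cuda:11.4.0-base-ubuntu20.04")

# Tuples compare element by element, so (11, 4) <= (11, 4) holds.
print("image is compatible with host driver:", image <= host)
```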

Now you can write a Docker Compose file to start your container with a GPU attachment.

Accessing GPUs in Docker Compose

GPUs are referenced in a docker-compose.yml file via the deploy.resources.reservations.devices field within your services that need them. This mechanism lets you identify the GPUs you want to attach. Each selected device will be provided to your containers.

Here’s a simple example that starts a container using the nvidia/cuda image. It’ll emit information about your GPU when the container starts.

services:
  app:
    image: nvidia/cuda:11.4.0-base-ubuntu20.04
    command: nvidia-smi
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities: [gpu]

The deploy.resources.reservations.devices field specifies devices that your container can use. Setting the driver to nvidia and adding the gpu capability defines a GPU device.
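
To make the nesting concrete, here’s a small Python sketch that walks the same path through a service definition and returns its GPU entries. It works on a plain dictionary, such as a YAML parser would produce; gpu_reservations is a name we’ve made up for illustration.

```python
def gpu_reservations(service):
    """Return entries under deploy.resources.reservations.devices
    that request the "gpu" capability."""
    node = service
    for key in ("deploy", "resources", "reservations"):
        node = node.get(key, {})
    devices = node.get("devices", [])
    return [d for d in devices if "gpu" in d.get("capabilities", [])]


# The "app" service from the example above, written as a dictionary.
app = {
    "image": "nvidia/cuda:11.4.0-base-ubuntu20.04",
    "command": "nvidia-smi",
    "deploy": {
        "resources": {
            "reservations": {
                "devices": [{"driver": "nvidia", "capabilities": ["gpu"]}]
            }
        }
    },
}

print(gpu_reservations(app))
```

A service without a deploy section simply yields an empty list.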

Run docker-compose up (or docker compose up for Compose v2) to start your container:

$ docker compose up
Creating network "scratch_default" with the default driver
Creating scratch_app_1 ... done
Attaching to scratch_app_1
app_1  | Tue May 10 14:21:14 2022       
app_1  | +-----------------------------------------------------------------------------+
app_1  | | NVIDIA-SMI 470.103.01   Driver Version: 470.103.01   CUDA Version: 11.4     |
app_1  | |-------------------------------+----------------------+----------------------+

The container should successfully obtain access to your GPU. The driver and CUDA versions will match those installed on your host.

Using Multiple GPUs

Your container receives access to all the GPUs in your system unless further configuration is supplied. There are two different ways to access a subset of your GPU devices.

Accessing a Fixed Number of Devices

The count field reserves a specified number of devices. In this example, a system with two GPUs will provide one of them to the container. It’s arbitrary which one will be selected.

services:
  app:
    image: nvidia/cuda:11.4.0-base-ubuntu20.04
    command: nvidia-smi
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

Accessing Specific Devices

You can identify individual devices in your system using the device_ids field. This accepts an array of 0-indexed device IDs to provide to the container. You can find these IDs by listing your GPUs with nvidia-smi:

$ nvidia-smi --list-gpus
GPU 0: NVIDIA GeForce GTX 1080 Ti (UUID: GPU-5ba4538b-234f-2c18-6a7a-458d0a7fb348)
GPU 1: NVIDIA GeForce GTX 1080 Ti (UUID: GPU-d5ce9af3-710c-4222-95f8-271db933d438)
GPU 2: NVIDIA GeForce GTX 1080 Ti (UUID: GPU-50d4eb4f-7b08-4f8f-8d20-27d797fb7f19)
GPU 3: NVIDIA GeForce GTX 1080 Ti (UUID: GPU-bed2d40a-c6e7-4547-8d7d-a1576c5247b2)

To reliably access the last two devices in the list, include their device IDs in your service configuration:

services:
  app:
    image: nvidia/cuda:11.4.0-base-ubuntu20.04
    command: nvidia-smi
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["2", "3"]
              capabilities: [gpu]
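
If you’d rather not copy IDs by hand, the nvidia-smi --list-gpus output can be parsed with a few lines of Python. Treat this as a sketch: it assumes the line format shown earlier, and parse_gpu_list is our own helper name.

```python
import re

# Matches lines such as: GPU 0: NVIDIA GeForce GTX 1080 Ti (UUID: GPU-...)
LINE = re.compile(r"^GPU (\d+): (.+) \(UUID: (GPU-[0-9a-f-]+)\)$")


def parse_gpu_list(output):
    """Turn `nvidia-smi --list-gpus` output into a list of dictionaries."""
    gpus = []
    for line in output.splitlines():
        match = LINE.match(line.strip())
        if match:
            index, name, uuid = match.groups()
            # IDs stay as strings because device_ids takes a list of strings.
            gpus.append({"id": index, "name": name, "uuid": uuid})
    return gpus


sample = """\
GPU 0: NVIDIA GeForce GTX 1080 Ti (UUID: GPU-5ba4538b-234f-2c18-6a7a-458d0a7fb348)
GPU 1: NVIDIA GeForce GTX 1080 Ti (UUID: GPU-d5ce9af3-710c-4222-95f8-271db933d438)
"""

ids = [gpu["id"] for gpu in parse_gpu_list(sample)]
print(ids)
```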

You can use either count or device_ids in a service definition, but not both. You’ll get an error when you run docker-compose up if you try to combine them, specify an invalid device ID, or use a value of count that’s higher than the number of GPUs in your system.
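
You can catch these mistakes before starting any containers with a pre-flight check. The sketch below mirrors the three rules just described; it isn’t Compose’s own validation logic, and the function name and messages are hypothetical.

```python
def check_gpu_device(device, available_ids):
    """Return a list of problems found in one reservations.devices entry.

    available_ids is the list of GPU IDs present on the host, as strings.
    """
    problems = []
    has_count = "count" in device
    has_ids = "device_ids" in device

    if has_count and has_ids:
        problems.append("count and device_ids cannot be combined")

    if has_ids:
        unknown = [i for i in device["device_ids"] if i not in available_ids]
        if unknown:
            problems.append(f"unknown device IDs: {', '.join(unknown)}")

    # count may also be the string "all", which is always satisfiable.
    if has_count and device["count"] != "all" and device["count"] > len(available_ids):
        problems.append(
            f"count {device['count']} exceeds {len(available_ids)} available GPUs"
        )

    return problems


host_ids = ["0", "1", "2", "3"]
print(check_gpu_device({"driver": "nvidia", "device_ids": ["2", "3"]}, host_ids))
print(check_gpu_device({"driver": "nvidia", "count": 5}, host_ids))
```

An empty list means the entry passes all three checks.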

Summary

Modern Docker Compose releases support GPU access via the deploy.resources device reservations feature. You’re still responsible for preparing your host environment and using a GPU-enabled container image. Once that’s taken care of, running docker-compose up -d is simpler than remembering to include the --gpus all flag each time you use docker run.

You can commit your docker-compose.yml file into source control so everyone gets automatic GPU access. You should make sure you standardize on consistent versions of the NVIDIA driver, as the CUDA release used by your image needs to be supported by the driver installed on your hosts. In the future, Docker’s GPU support could work with Intel and AMD devices too, but trying to use it today will result in an error. NVIDIA is the only GPU vendor currently supported by the Moby project.




