This repo contains a hands-on introduction to Docker concepts, workflows, and commands for ML Engineers. It covers the basics of Docker and Docker Compose through practical examples and best practices. The tutorial includes three hands-on projects demonstrating Docker usage, and also introduces modern Python tooling with uv for fast and reliable dependency management.
Docker is an open-source developer tool that automates the deployment and management of applications using containerization technology. It packages an application and all its dependencies together in the form of a container, ensuring that the application works seamlessly in any environment or machine.
Containerization is a lightweight form of virtualization that packages an application and its dependencies into a standardized unit (container) for software development and deployment.
The motivation behind containerization is to create a consistent environment for applications to run in, regardless of the underlying host system. Containers are isolated from each other and share the host OS kernel and resources, making them lightweight and efficient.
Comparison to Traditional Deployment:
| Traditional Deployment | Containerization |
|---|---|
| Inconsistent environments | Consistent across environments |
| Complex dependency management | Dependencies bundled with application |
| Requires full OS per application | Shares host OS kernel across applications |
| Heavy resource consumption | Lightweight resource usage |
| Slow to start up | Nearly instant startup |
| Difficult to scale | Easy to scale horizontally |
Docker uses a client-server architecture with these main components:

- Docker client (`docker`): the command-line interface users interact with
- Docker daemon (`dockerd`): the background service that builds, runs, and manages containers
- Docker registries (such as Docker Hub): store and distribute Docker images
Docker can be installed easily using Docker Desktop, which also comes with a GUI application. It is available for Windows, macOS, and Linux. Here's how to install it on Windows:

1. Download the Docker Desktop installer from the official Docker website
2. Run the installer and follow the setup prompts
3. Verify the installation from a terminal:

docker --version

On Ubuntu, the latest Docker Engine can be installed using the following commands:
# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
# Add the repository to Apt sources:
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
# Install Docker Engine
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
# Verify installation
docker --version
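To confirm that the daemon is running and can pull images, you can run Docker's official hello-world test image (this assumes internet access; prefix with `sudo` if your user is not in the `docker` group):

```bash
# Pulls the hello-world image on first run and prints a confirmation message
docker run hello-world
```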
| Command | Description |
|---|---|
| `docker images` | List downloaded images |
| `docker pull <image>` | Download an image |
| `docker push <image>` | Upload an image to registry |
| `docker rmi <image>` | Remove an image |
| `docker build -t <name>:<tag> <path>` | Build image from Dockerfile |
| `docker history <image>` | Show image layer history |
| `docker save <image> > file.tar` | Save image to tar archive |
| `docker load < file.tar` | Load image from tar archive |
| Command | Description |
|---|---|
| `docker run <image>` | Create and start a container |
| `docker run -d <image>` | Run container in detached mode |
| `docker run -p <host-port>:<container-port> <image>` | Map container port to host port |
| `docker run -v <host-path>:<container-path> <image>` | Mount host directory into container |
| `docker run --name <name> <image>` | Assign a name to the container |
| `docker run --rm <image>` | Remove container when it exits |
| `docker start <container>` | Start a stopped container |
| `docker stop <container>` | Stop a running container |
| `docker restart <container>` | Restart a container |
| `docker pause <container>` | Pause a running container |
| `docker unpause <container>` | Unpause a paused container |
| `docker rm <container>` | Remove a container |
| `docker rm -f <container>` | Force remove a running container |
| Command | Description |
|---|---|
| `docker ps` | List running containers |
| `docker ps -a` | List all containers |
| `docker logs <container>` | View container logs |
| `docker logs -f <container>` | Follow container logs |
| `docker inspect <container>` | View detailed container info |
| `docker exec -it <container> <command>` | Execute command in running container |
| `docker exec -it <container> bash` | Start a shell in container |
| `docker top <container>` | Display running processes |
| `docker stats` | Display container resource usage |
| Command | Description |
|---|---|
| `docker info` | Display system-wide information |
| `docker version` | Show Docker version |
| `docker system prune` | Remove unused data |
| `docker system prune -a` | Remove all unused images and containers |
A Dockerfile is a text document containing instructions to build a Docker image:
# Base image
FROM python:3.9-slim
# Set working directory
WORKDIR /app
# Copy requirements and install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code
COPY src/ .
# Set environment variables
ENV HOST="localhost"
ENV PORT=8080
# Expose port
EXPOSE 8080
# Command to run when container starts
CMD ["python", "app.py"]
| Instruction | Description |
|---|---|
| `FROM` | Base image to build from |
| `WORKDIR` | Set working directory |
| `COPY` | Copy files from host to image |
| `ADD` | Copy files and extract archives |
| `RUN` | Execute commands during build |
| `ENV` | Set environment variables |
| `EXPOSE` | Document container ports |
| `VOLUME` | Create mount point for volumes |
| `CMD` | Default command to run on start |
| `ENTRYPOINT` | Configure container as executable |
Docker uses a layered filesystem to build images. Each instruction in a Dockerfile creates a new layer. Layers are cached and reused between builds, which is why rarely changing steps (such as installing dependencies) are placed before frequently changing ones (such as copying application code).
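You can inspect the layers of any local image with `docker history` (here `my-app:latest` refers to the hypothetical image built in the earlier example):

```bash
# Each row corresponds to one layer created by a Dockerfile instruction
docker history my-app:latest
```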
Docker Compose is a tool for defining and running multi-container Docker applications. It uses a YAML file to configure application services, networks, and volumes.
services:
web_service:
build: ./web
ports:
- "8000:8000"
volumes:
- ./web:/app
environment:
- DEBUG=True
depends_on:
- db_service
db_service:
image: postgres:13
volumes:
- postgres_data:/var/lib/postgresql/data
environment:
- POSTGRES_USER=postgres
- POSTGRES_PASSWORD=postgres
- POSTGRES_DB=myapp
volumes:
postgres_data:
networks:
default:
driver: bridge
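A typical session with this file (assuming it is saved as `compose.yaml` in the project root) might look like this:

```bash
# Build and start both services in the background
docker compose up -d

# Verify that web_service and db_service are running
docker compose ps

# Open a psql shell inside the database service
docker compose exec db_service psql -U postgres -d myapp

# Stop and remove the containers (the postgres_data volume is kept)
docker compose down
```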
| Command | Description |
|---|---|
| `docker compose up` | Create and start all services |
| `docker compose up -d` | Start in detached mode |
| `docker compose down` | Stop and remove all services |
| `docker compose down -v` | Also remove volumes |
| `docker compose down --rmi all` | Also remove all images |
| `docker compose ps` | List containers |
| `docker compose logs` | View output from all services |
| `docker compose logs -f` | Follow log output |
| `docker compose exec <service> <command>` | Execute command in service |
| `docker compose build` | Build or rebuild services |
| `docker compose pull` | Pull service images |
| `docker compose restart` | Restart services |
| `docker compose stop` | Stop services |
| `docker compose start` | Start services |
Useful commands for debugging containers:

- View logs: `docker logs <container>` or `docker compose logs <service>`
- Open a shell inside a container: `docker exec -it <container> bash`
- Display running processes: `docker top <container>`
- Monitor resource usage: `docker stats`
- Inspect a network: `docker network inspect <network>`
- Inspect a volume: `docker volume inspect <volume>`

This tutorial includes three hands-on projects that demonstrate different aspects of Docker and common use cases. Each project builds upon the concepts learned in previous sections and introduces new Docker features. These projects can serve as base templates for your own Docker projects.
Location: 1-nginx-website
This project demonstrates how to use pre-built Docker images to quickly deploy a static website using Nginx. It uses the pre-built Nginx image from Docker Hub.
Learning Objectives:
For more details on the project, please refer to Project 1 Documentation
Location: 2-pdf-generator
This project shows how to build a custom Docker image for a Python-based PDF generation service. It demonstrates creating multi-stage builds, handling dependencies, and exposing services through APIs.
Learning Objectives:
For more details on the project, please refer to Project 2 Documentation
Location: 3-text-extractor
This advanced project demonstrates a real-world ML service using Docker Compose to orchestrate multiple containers. It includes a text extraction service, API server, and database, showing how to manage complex multi-container applications.
Learning Objectives:
For more details on the project, please refer to Project 3 Documentation
Docker provides several benefits for Machine Learning projects:

- Reproducible training and inference environments
- System-level dependencies (such as CUDA and cuDNN) bundled into the image
- Portability between development machines, on-premise servers, and the cloud
- Straightforward horizontal scaling of inference services
To use GPUs in Docker containers, the following setup is required:
The NVIDIA Container Toolkit is a collection of libraries and utilities enabling users to build and run GPU-accelerated containers. Follow the instructions on the official guide to install the toolkit.
Here are the commands to run containers with GPU support:
# Check GPU availability
docker run --gpus all nvidia/cuda:11.0-base nvidia-smi
# Run TensorFlow with all GPU support
docker run --gpus all tensorflow/tensorflow:latest-gpu
# Run TensorFlow with specific GPU support
docker run -it --rm --gpus '"device=0,2"' tensorflow/tensorflow:latest-gpu nvidia-smi
Here's an example of a Compose file for running a service with access to one GPU device:
services:
ml_service:
image: nvidia/cuda:12.3.1-base-ubuntu20.04
command: nvidia-smi
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
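Below is an example Dockerfile for packaging a GPU-enabled ML model server: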
FROM nvidia/cuda:11.6.2-cudnn8-runtime-ubuntu20.04
WORKDIR /app
# Install Python and dependencies
RUN apt-get update && apt-get install -y \
python3 \
python3-pip \
&& rm -rf /var/lib/apt/lists/*
# Install ML libraries
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt
# Copy model and code
COPY . .
# Expose port for API
EXPOSE 8000
# Start the model server
CMD ["python3", "serve.py"]
uv is an extremely fast Python package and project manager written in Rust. It aims to be a modern replacement for tools like pip, pip-tools, poetry, virtualenv and more. With speeds 10-100x faster than pip for dependency installation, it brings modern package management capabilities while maintaining backward compatibility with existing Python tooling.
Key features of uv:
uv is developed by Astral, the creators of Ruff, and represents the next generation of Python tooling focused on performance and developer experience.
| Tool | Pros | Cons |
|---|---|---|
| conda | Environment + package manager, supports non-Python deps | Slow, heavy |
| venv | Built-in, lightweight | Just virtual environments, no package management |
| poetry | Modern dependency resolution, builds packages | Complex, sometimes slow |
| uv | Ultra-fast, Rust-based, compatible with pip | Newer, fewer features |
| Command | Description |
|---|---|
| `uv venv` | Create virtual environment |
| `uv pip install <package>` | Install specific Python packages |
| `uv pip uninstall <package>` | Remove package |
| `uv pip install -r requirements.txt` | Install packages from requirements file |
| `uv pip freeze` | Output installed packages |
| `uv lock` | Generate uv.lock file from pyproject.toml |
| `uv sync` | Create virtual environment and install all dependencies from pyproject.toml |
# Create new virtual environment
uv venv
# Activate virtual environment
source .venv/bin/activate # Linux/Mac
# Install dependencies
uv pip install -r requirements.txt
Here's a more detailed guide for working on projects using uv and pyproject.toml: Working on projects
- Keep `uv.lock` and `pyproject.toml` files in version control

uv can also be used inside Docker images to speed up dependency installation:

FROM python:3.9-slim
# Install uv
RUN pip install uv
# Use uv for faster dependency installation
# (--system targets the image's Python, since no virtual environment exists)
COPY requirements.txt .
RUN uv pip install --system -r requirements.txt
# ... rest of Dockerfile
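Alternatively, uv's Docker documentation describes copying the uv binary from its official image instead of installing uv with pip. A minimal sketch of that pattern (pin a specific version tag in practice):

```dockerfile
FROM python:3.9-slim

# Copy the uv binary from the official uv image (pattern from uv's Docker guide)
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/

# Install dependencies into the image's system Python
COPY requirements.txt .
RUN uv pip install --system -r requirements.txt
```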
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
1. Fork the Project
2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the Branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

This project is licensed under the MIT License - see the LICENSE file for details.