You’ve built a brilliant AI-powered application in .NET, perhaps using an ONNX model for local inference. It works perfectly on your machine. Now, how do you ship it?
The answer is Docker. Containerization solves the “it works on my machine” problem by packaging your application, its dependencies, your .NET runtime, and even your AI models into a single, isolated unit called an image.
This guide will walk you through creating an optimized, multi-stage Dockerfile for a .NET application that uses the ONNX Runtime.
Why Docker for AI Apps?#
- Environment Consistency: The exact same environment is used for development, testing, and production. No more “missing dependency” errors.
- Dependency Isolation: Your app’s specific version of CUDA, ONNX Runtime, or any other library won’t conflict with other applications on the host machine.
- Scalability: Docker containers can be easily scaled up or down using orchestrators like Kubernetes.
- Portability: A Linux container image built on your Windows or macOS development machine (via Docker Desktop) runs unchanged on a Linux server in the cloud.
The Project Structure#
Let’s assume we have a simple console application with the following structure:
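The original listing isn't preserved here, so the layout below is a representative sketch — the `SentimentAnalyzer` project name is an assumption; only `sentiment-model.onnx` is named in the text:

```text
SentimentAnalyzer/
├── Program.cs               # console entry point that runs ONNX inference
├── SentimentAnalyzer.csproj
├── sentiment-model.onnx     # the model file shipped with the app
├── Dockerfile
└── .dockerignore
```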
The sentiment-model.onnx file should be marked as Content and set to Copy if newer in the .csproj file:
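In SDK-style project files, Visual Studio's "Copy if newer" setting corresponds to `CopyToOutputDirectory` set to `PreserveNewest`. A sketch of the relevant `ItemGroup`:

```xml
<ItemGroup>
  <Content Include="sentiment-model.onnx">
    <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
  </Content>
</ItemGroup>
```

This ensures `dotnet publish` places the model next to the compiled binaries, so the Dockerfile can copy both in a single step.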
The Multi-Stage Dockerfile#
A multi-stage build is the best practice for creating lean, secure Docker images. We use one stage (the build stage) to compile the application with the full SDK, and a final stage that only contains the minimal runtime and our application artifacts.
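A sketch of such a Dockerfile, assuming .NET 8 images and a project named `SentimentAnalyzer` (both the tag and the name are assumptions, not from the original):

```dockerfile
# ---- Build stage: compile with the full .NET SDK ----
FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build
WORKDIR /src

# Copy the project file alone first, so the `dotnet restore` layer
# stays cached until the package references actually change
COPY SentimentAnalyzer.csproj ./
RUN dotnet restore

# Copy the remaining sources (including the .onnx model) and publish
COPY . ./
RUN dotnet publish -c Release -o /app/publish

# ---- Final stage: minimal runtime image ----
FROM mcr.microsoft.com/dotnet/runtime:8.0 AS final
WORKDIR /app

# libgomp1 (the GNU OpenMP runtime) is required by ONNX Runtime's
# native library on Debian-based images but is not preinstalled
RUN apt-get update \
    && apt-get install -y --no-install-recommends libgomp1 \
    && rm -rf /var/lib/apt/lists/*

COPY --from=build /app/publish ./
ENTRYPOINT ["dotnet", "SentimentAnalyzer.dll"]
```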
Key Optimizations in This Dockerfile:#
- Multi-Stage Build: The final image doesn’t contain the .NET SDK (which is huge). It only contains the minimal runtime needed to execute the application.
- Layer Caching: By copying the `.csproj` file and running `dotnet restore` before copying the rest of the code, we ensure that Docker doesn’t need to re-download all the NuGet packages every time we change a single line of C# code.
- Missing Dependencies: We explicitly install `libgomp1`, the GNU OpenMP runtime that ONNX Runtime’s native library links against. This is a classic “gotcha” when moving .NET AI apps from Windows to Linux containers.
Handling Large Models with .dockerignore#
If your AI models are massive (e.g., 5GB+), you don’t want them baked into the build-stage layers when they aren’t needed for compilation.
One caveat: anything listed in `.dockerignore` is stripped from the build context entirely and cannot be `COPY`-ed in *any* stage. So use `.dockerignore` for assets no stage needs (build output, VCS metadata), keep the model in the context, and scope your `COPY` instructions so it lands only in the final stage.
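A sketch of a `.dockerignore` for this project. The model itself stays out of the list, because excluded paths are invisible to every `COPY` in every stage:

```text
# Keep build output and VCS metadata out of the build context
bin/
obj/
.git/
```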
Then, in your Dockerfile, copy the model directly to the final stage:
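One way to do this is to scope the build-stage `COPY` to source files only, then add the model in the final stage straight from the build context. The project name is an assumption; note that in this variant the model should not be declared as `Content` in the `.csproj`, since the file would be absent from `/src` at publish time:

```dockerfile
FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build
WORKDIR /src
COPY SentimentAnalyzer.csproj ./
RUN dotnet restore
# Copy source files only, so the multi-GB model never enters a build-stage layer
COPY *.cs ./
RUN dotnet publish -c Release -o /app/publish

FROM mcr.microsoft.com/dotnet/runtime:8.0 AS final
WORKDIR /app
COPY --from=build /app/publish ./
# Copy the model from the build context directly into the final stage
COPY sentiment-model.onnx ./
ENTRYPOINT ["dotnet", "SentimentAnalyzer.dll"]
```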
Handling GPU-Enabled Models#
Running AI on a CPU is fine for simple tasks, but for heavy lifting, you need a GPU. This complicates things because the standard .NET images don’t include NVIDIA drivers or CUDA libraries.
To support GPUs, you typically need to:
- Use an NVIDIA CUDA base image (e.g., `nvidia/cuda:12.1-base-ubuntu22.04`).
- Install the .NET Runtime on top of it.
- Use the `Microsoft.ML.OnnxRuntime.Gpu` NuGet package instead of the CPU version.
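One possible shape for the GPU Dockerfile, assuming Ubuntu 22.04, CUDA 12.1, and .NET 8 — the exact image tags and the project name are assumptions. Note that `Microsoft.ML.OnnxRuntime.Gpu` also needs the CUDA/cuDNN runtime libraries, which the slim `-base` variant does not ship, so a cuDNN runtime variant is used here:

```dockerfile
# ---- Build stage: unchanged, compile with the .NET SDK ----
FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build
WORKDIR /src
COPY SentimentAnalyzer.csproj ./
RUN dotnet restore
COPY . ./
RUN dotnet publish -c Release -o /app/publish

# ---- Final stage: CUDA base image with .NET installed on top ----
FROM nvidia/cuda:12.1.1-cudnn8-runtime-ubuntu22.04 AS final
WORKDIR /app

# Install the .NET runtime from Microsoft's package feed for Ubuntu 22.04
RUN apt-get update \
    && apt-get install -y --no-install-recommends wget ca-certificates \
    && wget -q https://packages.microsoft.com/config/ubuntu/22.04/packages-microsoft-prod.deb \
    && dpkg -i packages-microsoft-prod.deb \
    && rm packages-microsoft-prod.deb \
    && apt-get update \
    && apt-get install -y --no-install-recommends dotnet-runtime-8.0 libgomp1 \
    && rm -rf /var/lib/apt/lists/*

COPY --from=build /app/publish ./
ENTRYPOINT ["dotnet", "SentimentAnalyzer.dll"]
```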
When running the container, you must pass the --gpus all flag, which requires the NVIDIA Container Toolkit to be installed on the host:
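Assuming the image was tagged `sentiment-analyzer:gpu` (a hypothetical tag):

```shell
docker run --rm --gpus all sentiment-analyzer:gpu
```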
Conclusion#
Docker is an essential tool for modern application deployment, and it’s a perfect match for .NET AI applications. By using a multi-stage Dockerfile, you can create lean, portable, and scalable images that encapsulate your application, your models, and all their dependencies. This consistent and reproducible approach simplifies deployment from your local machine to any cloud provider.
Further Reading#
- Docker Documentation - Official Docker documentation
- .NET Docker Images - Official .NET container images
- ONNX Runtime Docker Images - Pre-built ONNX Runtime containers
- Multi-stage builds - Docker’s guide on multi-stage builds
