Slimming a Docker container from nearly 8gb to under 200mb

Web developers are no strangers to bloated apps or gigantic Docker containers. While many apps can put on impressive gains just by uttering the magic words "node_modules", even a relatively simple dotnet app can get bulky once the dotnet SDK and package dependencies have been installed.

Such was the case with the container running this very website, which is a custom F# program (another .NET language, sibling to C#) with zero node dependencies except for the Stylus and TypeScript compilers. This site doesn't even have a package.json file, and no dependencies get installed to a node_modules folder. It's very close to pure .NET, and the Docker container that hosted the site was nearly 8gb once built.

Amazingly, I didn't even realize this site's container was so big until I tried to move the site from a custom VM to an Azure Container Web App. The size was so great that the container literally wouldn't work on Azure's Basic B1 plan -- the cheapest plan available, a single step above free. My Azure compute instance just couldn't pull, extract and run the container at all, and would only return 502 Service Unavailable errors.

It turns out that I had made a few "rookie" mistakes which caused my container's size to bloat so badly. I put "rookie" in quotes because most of these are things that I knew I should be doing but I took the lazy route instead. I didn't think it would matter for a small site like this because, after all, there's no way a nearly-static site could be more than a gigabyte or two once placed in a container, right?

But in hindsight, those rookie mistakes did matter, and I should have put a little more effort into my Dockerfile on the first pass. There were three easy changes that I made to massively reduce the size of the site's container from almost eight gigabytes to less than two hundred megabytes:

  1. Use multi-stage builds. My Docker container was using the fsharp:netcore image, which by itself is pretty big, and then I installed the Node.js runtime (and all of the underlying packages it relies on), plus my Stylus and TypeScript compilers. With multi-stage builds, you only bring in bits like the F# compiler and Node for the steps that need them, and then you can drop them and copy the results over to a slimmer image.
  2. Remove superfluous files and packages once the source code has been compiled. While the source code files themselves aren't that big, the .NET packages folder for this website was over two gigabytes on its own -- and that's for a website with only four .NET dependencies.
  3. Use Alpine as the final runtime image. This is partially related to #1, but using Alpine in particular is a huge win for any containerized app since it weighs less than six megabytes total! That's a massive amount of space saved just from using that particular image as the container's final runtime image. However, that fat was trimmed from somewhere, which means Alpine only has the bare necessities needed to run itself, and doesn't come with common binaries you might find in a regular Ubuntu image.

An example

Let's take a look at a quick example where we can run through each of the steps outlined above. Below is the exact same Dockerfile that I was using for this website before I slimmed it down:

FROM fsharp:netcore
WORKDIR /app

# install utils
RUN apt update
RUN apt install xz-utils -y

# nodejs
# https://nodejs.org/en/download/
RUN mkdir -p /opt/nodejs && curl -sL https://nodejs.org/dist/v10.15.0/node-v10.15.0-linux-x64.tar.xz | tar xJf - -C /opt/nodejs --strip-components=1
ENV PATH $PATH:/opt/nodejs/bin
RUN node --version

# yarn
# https://yarnpkg.com/en/docs/install#alternatives-stable
RUN mkdir -p /opt/yarn && curl -sL https://yarnpkg.com/latest.tar.gz | tar xzf - -C /opt/yarn --strip-components=1
ENV PATH $PATH:/opt/yarn/bin
RUN yarn --version

# install TS and Stylus
RUN yarn global add typescript stylus

# Restore packages
COPY paket.lock paket.dependencies ./
COPY ./.paket ./.paket
RUN mono .paket/paket.bootstrapper.exe
RUN mono .paket/paket.exe restore
COPY ./restore-frontend.fsx .
RUN fsharpi restore-frontend.fsx

# Copy everything else and build
COPY . .
RUN stylus -c ./src/public/css -o ./src/public/css
RUN tsc -p .
RUN dotnet publish -c Release -o published -r linux-x64

EXPOSE 3000
CMD [ "./src/published/nozzlegear.com" ]

And that Dockerfile resulted in a container that was 7.61gb:

The original Docker container was 7.61gb before size reductions

As you can see in the Dockerfile, there was no package.json, and the only reason Node and Yarn were installed was to install the TypeScript and Stylus compilers. The site was (and is) using a .NET package manager called Paket to restore the packages for the website itself. Paket is very similar to NuGet, except that it supports lock files, which the NuGet CLI does not yet do (although lock files are finally coming to NuGet soon).

The .NET packages restored by Paket were easily the biggest storage hogs besides the container image itself. Once restored, the packages folder weighed in at a hefty 2gb. What's even more surprising is that these two gigabytes' worth of packages were installed for a website that only has four dependencies: FSharp.Core, Microsoft.FSharpLu.Json (a lightweight JSON parser for F#), Suave (a lightweight web framework like ASP.NET or Nancy), and Markdig (a package to convert Markdown to HTML).

How those four dependencies translate to two gigabytes I'll never know, as I was even having Paket specifically install packages for only the netstandard1.0 framework (where usually it would install packages for all framework versions by default to make framework switching faster). I know Node has major package bloat issues with the node_modules folder, but this two gigabyte packages folder easily blows away even the biggest Node projects I've built.

Regardless, the build process for this Docker container went like this:

  1. Pull in an already big fsharp:netcore base image and add Node/Yarn to it.
  2. Install the TypeScript and Stylus compilers.
  3. Copy the Paket files and restore the .NET packages, then copy all the remaining files and folders from the source directory.
  4. Compile the Stylus and TypeScript files.
  5. Call dotnet publish on the website project, which compiles the program and bundles it up with all of its dependencies, dropping them into a single folder.

Publishing the .NET project meant that everything the website needed to run was contained in the output folder. The packages that get restored by Paket (or even NuGet) are no longer needed after that point and only serve as dead weight. Additionally, all of the F#, TypeScript and Stylus source files were dead weight too. They'd already been compiled into the program, JS and CSS files, respectively. While these files are nowhere near the size of the .NET packages folder, they still aren't needed for running the website itself, and they serve no further purpose.

Adding multi-stage builds

The biggest improvement that can be made to this Dockerfile (and most other Dockerfiles) is using multi-stage builds and swapping to a thin Alpine image at the end. Multi-stage builds also come with the side benefit that we don't need to install Node and Yarn ourselves -- we can just swap to the official Node image when it's needed.

I like to organize my Docker build scripts from slowest tasks to fastest, which takes advantage of Docker's layer cache to reuse build steps whose inputs haven't changed. In this case, the .NET/Paket restore and publish process is the slowest part of the build, so that will go first. The restore steps only depend on the Paket files, so they stay cached until a dependency changes, which saves a decent chunk of time. Note that COPY . . is invalidated by a change to any file in the build context, so the publish step is only reused when nothing in the project directory has changed (or when the changed files are excluded from the build context).

In bigger web apps, it's conceivable that the Webpack compilation process is slower and you might want that part to go first.
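
The ordering idea can be seen in a stripped-down toy Dockerfile. This is only a sketch and not part of this site's build: deps.lock and src/ are hypothetical stand-ins for a lock file and a source folder in the build context, and the sleep stands in for a slow package restore.

```dockerfile
FROM alpine:3
WORKDIR /app

# Rarely changes: copy it first so the slow step below stays cached.
# (deps.lock is a hypothetical file standing in for something like paket.lock.)
COPY deps.lock .
RUN sleep 5 && cp deps.lock deps.restored

# Changes often: copy it last. Editing anything under src/ invalidates
# the cache from this line down, but not the two layers above.
COPY src ./src
RUN ls src > build.log
```

Build it twice with a change under src/ in between, and the second build should reuse the cached deps.lock layers and rerun only the last two steps.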

Using multi-stage builds is actually super simple. Every Dockerfile starts with a FROM IMAGENAME instruction to select the starting image, and to use multi-stage builds all you have to do is add more of those where you need them. The images I'm going to use are the fsharp:netcore image to build the F# app, then I'll switch to the node:10 image to install/run the TypeScript and Stylus compilers, and then finally I'll switch to microsoft/dotnet:2.2-runtime-alpine for running the web server itself.

Because each FROM starts a new stage with a fresh filesystem, nothing carries over automatically: any files you need must be copied from the earlier stage with COPY --from, and the WORKDIR needs to be set again after each swap.
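
To see that behaviour in isolation, here is a toy two-stage Dockerfile that has nothing to do with this site. It is a minimal sketch: the stage name "build" and the file names are made up for illustration. Stages can be referenced by index (0, 1, ...) or by a name given with AS.

```dockerfile
# Stage 0, named "build": pretend this is where compilation happens
FROM alpine:3 AS build
WORKDIR /app
RUN echo "compiled output" > artifact.txt \
 && echo "intermediate junk" > leftover.tmp

# Final stage: a brand new filesystem, nothing from "build" is here yet
FROM alpine:3
WORKDIR /app

# Only artifact.txt crosses over; leftover.tmp stays behind in stage 0
COPY --from=build /app/artifact.txt .
CMD [ "cat", "/app/artifact.txt" ]
```

Running the resulting image prints the contents of artifact.txt, and leftover.tmp never makes it into the final image.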

With those changes made, this is what the Dockerfile should look like:

FROM fsharp:netcore
WORKDIR /app

# Restore dotnet packages
COPY paket.lock paket.dependencies ./
COPY ./.paket ./.paket
RUN mono .paket/paket.bootstrapper.exe
RUN mono .paket/paket.exe restore
COPY ./restore-frontend.fsx .
RUN fsharpi restore-frontend.fsx

# Copy everything from the project directory and build
COPY . .
RUN dotnet publish -c Release -o published -r linux-musl-x64

# Switch to node for frontend items
FROM node:10
WORKDIR /app

# install TS and Stylus
RUN yarn global add typescript stylus

# Copy files from F# image's /app to this image's /app, then build
COPY --from=0 /app/ /app/
RUN stylus -c ./src/public/css -o ./src/public/css
RUN tsc -p .

# Switch to alpine
FROM microsoft/dotnet:2.2-runtime-alpine
WORKDIR /app

# Copy the built files from the last image's /app to this image's /app
COPY --from=1 /app /app
# The publish output lives under src/, matching the original Dockerfile
RUN chmod +x ./src/published/nozzlegear.com

EXPOSE 3000
CMD [ "/app/src/published/nozzlegear.com" ]

One thing you might notice in this Dockerfile is that I actually change the target runtime of the dotnet publish command from linux-x64 to linux-musl-x64. This one took me a few minutes to puzzle out, but it turns out that you can't publish your .NET project for Linux x64 and expect it to work in an Alpine container. Alpine uses musl libc rather than the glibc that the linux-x64 runtime is built against, so .NET treats them as two different runtimes. To get your .NET project working in an Alpine Docker container, you need to target linux-musl-x64.

After building this container with docker build -t myapp . we get a roughly 64% reduction in total size, from 7.61gb to 2.75gb!

Container reduced from 7.61gb to 2.75gb

There's still another improvement to be made, though, and that's to only copy over the files that are absolutely necessary for the app to run -- that means removing packages, source code files, and little extra things like lock files and READMEs. It won't be quite as drastic as switching to Alpine for the final image, but (in my case) removing just the packages folder frees up an additional two gigabytes.

(You'll see that I'm also copying over a "posts" folder, which is just a folder that contains all of the posts on this website in Markdown format.)

FROM fsharp:netcore
WORKDIR /app

# Restore dotnet packages
COPY paket.lock paket.dependencies ./
COPY ./.paket ./.paket
RUN mono .paket/paket.bootstrapper.exe
RUN mono .paket/paket.exe restore
COPY ./restore-frontend.fsx .
RUN fsharpi restore-frontend.fsx

# Copy everything and build
COPY . .
RUN dotnet publish -c Release -o published -r linux-musl-x64

# Switch to node for frontend items
FROM node:10
WORKDIR /app

# install TS and Stylus
RUN yarn global add typescript stylus

# Copy frontend files and build
COPY --from=0 /app/src/public ./src/public
COPY --from=0 /app/tsconfig.json .
RUN stylus -c ./src/public/css -o ./src/public/css
RUN tsc -p .

# Switch to alpine
FROM microsoft/dotnet:2.2-runtime-alpine
WORKDIR /app

# Copy the built files from both fsharp and node
COPY --from=0 /app/src/published ./published
COPY --from=1 /app/src/public ./public
COPY ./src/posts ./posts
RUN chmod +x ./published/nozzlegear.com

EXPOSE 3000
CMD [ "/app/published/nozzlegear.com" ]

Finally, after one more docker build -t myapp . we end up with a container that weighs less than 150mb:

Container size reduced to 142mb

Some simple improvements to the Dockerfile have drastically reduced the size of the container by about 98%. That's a huge win for something that doesn't even change the compilation process of the website itself.
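
For anyone who wants to check the percentages, the arithmetic works out roughly as follows. The second line treats 7.61gb as 7610mb, which is a decimal approximation made here for convenience rather than a figure reported by Docker.

```latex
\[
\text{multi-stage + Alpine:}\quad
\frac{7.61 - 2.75}{7.61} = \frac{4.86}{7.61} \approx 0.639
\]
\[
\text{final image:}\quad
\frac{7610 - 142}{7610} = \frac{7468}{7610} \approx 0.981
\]
```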

