- Use one RUN to prepare, configure, make, install and clean up.
- Cleanup with:
apt-get remove --purge -y $BUILD_PACKAGES $(apt-mark showauto) && rm -rf /var/lib/apt/lists/*
I've been packaging the nghttp2
HTTP/2.0 proxy and client by
Tatsuhiro Tsujikawa in both Debian and
docker, and noticed it takes some time to get the build
dependencies (C++ cough) as well as to do the build.
In the Debian packaging case it's easy to create minimal dependencies
using pbuilder and ensure the binary package contains only the
right files.
With docker, since you work with containers, it's harder to see what
changed, but you still really want the containers as small as
possible since you have to download them to run the app, as well as
the disk use. While doing this I kept seeing huge images (480 MB),
way larger than the base image I was using (123 MB) and it didn't
make sense since I was just packaging a few binaries with some small
files, plus their dependencies. My estimate was that it should be
way less than 100 MB delta.
I pored over multiple blog posts about Docker images and how to make them small. I even looked at some of the squashing commands like docker-squash that involved import and export, but those seemed not quite the right approach.
It took me a while to really understand that each Dockerfile
command creates a new container with the deltas. So when you see
all those downloaded layers in a
docker pull of an image, it
sometimes is a lot of data which is mostly unused.
So if you want to make it small, you need to make each Dockerfile
command touch the smallest number of files and use a standard
image, so most people do not have to download your custom l33t base.
It doesn't matter if you
rm -rf the files in a later command; they
continue to exist in some intermediate layer container.
So: prepare, configure, build, make install and clean up in one RUN
command if you can. If the lines get too long, put the steps in
separate scripts and call them.
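A minimal sketch of such a script, under stated assumptions: the function name build_all, the DRY_RUN switch, the package list and the /build/src path are all illustrative and not taken from the nghttp2 packaging. With DRY_RUN=1 it only prints the steps, which makes it easy to check the order before running it for real.

```shell
#!/bin/sh
# Hypothetical all-in-one build script: prepare, build, install and clean up
# in one place, so the Dockerfile needs a single RUN line and a single layer.
set -e

# Illustrative defaults; substitute your own package list and source dir.
BUILD_PACKAGES="${BUILD_PACKAGES:-g++ make libxml2-dev}"
SRC_DIR="${SRC_DIR:-/build/src}"

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_all() {
    # prepare
    run apt-get update
    run apt-get install -y $BUILD_PACKAGES
    # configure, build, install
    run sh -c "cd $SRC_DIR && ./configure && make && make install"
    # clean up before the RUN ends, so nothing lingers in the layer
    run apt-get remove --purge -y $BUILD_PACKAGES
    run rm -rf "$SRC_DIR" /var/lib/apt/lists/*
}
```

Ending the script with a call to build_all and invoking it from one RUN line keeps every intermediate file inside the same layer, where the cleanup can still remove it.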
Lots of Docker images are based on Debian images because they are a
small and practical base. The
debian:jessie image is smaller than
the Ubuntu (and CentOS) images. I haven't checked out the fancy
'cloud' images too much:
Ubuntu Cloud Images,
Snappy Ubuntu Core,
Project Atomic, ...
In a Dockerfile building from some downloaded package, you generally need
curl and maybe git. When you install them, along with
ca-certificates to get TLS/SSL certificates,
it pulls in a lot of extra packages, such as
openssl in the standard Debian curl build.
You are pretty unlikely to need curl or git after the build stage of your package. So if you don't need them, you could - and you should - remove them, but that's one of the tricky parts.
If $BUILD_PACKAGES contains the list of build dependency packages,
e.g. libxml2-dev and so on, you would think that this would
get you back to the start state:
$ apt-get install -y $BUILD_PACKAGES
$ apt-get remove -y $BUILD_PACKAGES
However this isn't enough; you missed out those dependencies that got automatically installed and their dependencies.
You could try
$ apt-get autoremove -y
but that also doesn't grab them all. It's not clear to me why at
this point. What you actually need is to remove all auto-added
packages, which you can find with apt-mark showauto.
So what you really need is
$ AUTO_ADDED_PACKAGES=`apt-mark showauto`
$ apt-get remove --purge -y $BUILD_PACKAGES $AUTO_ADDED_PACKAGES
I added --purge too since we don't need any config files
for build packages we aren't using.
Having done that, you might have removed some runtime package dependencies of something you built. That's harder to automatically find, so you'll have to list and install those by hand
$ RUNTIME_PACKAGES="...."
$ apt-get install -y $RUNTIME_PACKAGES
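The removal and reinstall steps above could be wrapped in one small shell function. This is only a sketch: the function name cleanup_build_deps and the APT_GET / APT_MARK override variables are hypothetical, added so the logic can be tried with stand-in commands and without root.

```shell
# Sketch of the cleanup sequence: purge build packages plus everything apt
# auto-installed, then put back the runtime packages that are still needed.
# APT_GET and APT_MARK are hypothetical hooks for substituting the commands.
APT_GET="${APT_GET:-apt-get}"
APT_MARK="${APT_MARK:-apt-mark}"

cleanup_build_deps() {
    build_packages="$1"     # space-separated build dependencies
    runtime_packages="$2"   # space-separated packages needed at run time

    # Everything apt marked as automatically installed.
    auto_added=$($APT_MARK showauto)

    # The lists are left unquoted on purpose so they split into package names.
    $APT_GET remove --purge -y $build_packages $auto_added

    # Reinstall runtime dependencies the purge may have removed. This assumes
    # the package lists from apt-get update are still on disk.
    if [ -n "$runtime_packages" ]; then
        $APT_GET install -y $runtime_packages
    fi
}
```

A call might look like `cleanup_build_deps "$BUILD_PACKAGES" "$RUNTIME_PACKAGES"`, placed in the same RUN as the build.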
Finally you need to clean up apt, which you should do with
rm -rf /var/lib/apt/lists/* which is great and removes all the index
files that apt-get update installed. This is in many best practice
documents and example Dockerfiles.
You could add
apt-get clean which removes any cached downloaded
packages, but that's not needed in the official Docker images of
debian and ubuntu since the cached package archive feature is
disabled.
Finally, don't forget to delete your build tree, and do it in the same
RUN in which you did the compile, so the tree never creates a new
container. This might not make sense for some languages where you
work from inside the extracted tree; but why not delete the src dirs?
Definitely delete the tarball!
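One way to make sure the tree and the tarball never outlive the build step is to do the unpacking and the deletion in the same function. A minimal sketch, assuming a hypothetical helper name (build_and_discard) and a tarball that has already been downloaded:

```shell
# Unpack a source tarball into a temporary directory, run a build command
# there, then delete both the tree and the tarball before returning.
build_and_discard() {
    tarball="$1"    # path to the downloaded source tarball
    shift           # remaining arguments: the build command to run in the tree

    workdir=$(mktemp -d)
    tar -xf "$tarball" -C "$workdir" || { rm -rf "$workdir"; return 1; }

    # Run the build in a subshell so the caller's working directory is kept.
    status=0
    ( cd "$workdir" && "$@" ) || status=$?

    # Clean up whether or not the build succeeded.
    rm -rf "$workdir" "$tarball"
    return "$status"
}
```

For example, `build_and_discard /build/src.tar.gz sh -c 'cd */ && ./configure && make && make install'` (path and build command are illustrative) leaves only the installed files behind.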
This is the delta for what I was working on with nghttp2:
479.7 MB  separate prepare, build, cleanup 3x RUNs
186.8 MB  prepare, build and cleanup in one RUN
149.8 MB  after using apt-mark showauto in cleanup
You can use
docker history IMAGE to see the detailed
horror (edited for width):
... /bin/sh -c /build/cleanup-nghttp2.sh && rm -r   7.595 MB
... /bin/sh -c cd /build/nghttp2 && make install    76.92 MB
... /bin/sh -c /build/prepare-nghttp2.sh            272.4 MB
and the smallest version:
... /bin/sh -c /build/prepare-nghttp2.sh && 27.05 MB
The massive difference is the source tree and the 232 MB of build
dependencies that apt-get pulls in. If you don't clean all that up
before the end of the
RUN you end up with a huge transient layer.
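The arithmetic behind those numbers can be checked by adding up the layer sizes from docker history. This sketch assumes the base image counts as a single 122.8 MB layer and uses a hypothetical helper name (total_mb); the sizes docker prints are rounded, so the sums only match the reported totals approximately.

```shell
# Sum per-layer sizes (in MB). Layers are additive: deleting files in a
# later layer never shrinks an earlier one.
total_mb() {
    printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.3f\n", sum }'
}

BASE=122.8   # debian:jessie base image size quoted in this post

# Three separate RUNs: prepare, make install, cleanup.
total_mb "$BASE" 272.4 76.92 7.595   # prints 479.715

# One RUN doing prepare, build and cleanup together.
total_mb "$BASE" 27.05               # prints 149.850
```

The 7.595 MB cleanup layer in the first case adds to the image even though its whole purpose was to delete files.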
The final size of 149.8 MB compared to the 122.8 MB debian:jessie base image size is a delta of 27 MB, which for a few servers, a client and their libraries sounds great! I probably could get it down a little more if I just installed the binaries. The runtime libraries I use are 5.9 MB.
... and of course this HTTP/2 setup is used on this blog!
- flatten images - merge multiple layers into a single one #332
The open issue from April 2013 on, about flattening images, aka the
docker squash idea. TL;DR: It's hard.
- Flat Docker images
by Maciej Pasternacki. Describes a script to compress
- Create The Smallest Possible Docker Container by Adriaan de Jonge. Docker within Docker, static Go binaries and tiny images.
- Optimizing Docker Images
by Brian DeHamer. Where I learnt of
docker history IMAGE
- Squashing Docker Images
by Jason Wilder. Introduces
docker-squash Go application. docker-squash on GitHub
- Docker official best practices
- Dockerfile Best Practices - take 2 by Michael Crosby. Linked from the official page.
- Official Docker images bootstrap script. It includes multiple fixes, including for initrd, dpkg and apt speedup, and preventing services starting by policy-rc.d and/or upstart. It sets up apt-get to be "effectively running apt-get clean after every install" and disables the package caching.
- phusion image base script. The base preparation of phusion's images (which are quite opinionated) is also interesting if you want to find out how to set base languages and locales.