The new Docker 1.10 was released last week and it was a big event in the community. Besides fixing many bugs and sharpening the saw, Docker 1.10 introduced several major improvements and features we were waiting for. In this post I would like to highlight the features that I personally consider the most useful and most exciting.
Docker Compose with support for Networks and Volumes
Docker Compose 1.6 was released together with Docker 1.10. It adds support for networks and volumes as top-level entities, following the completely redesigned volume and networking systems introduced in Docker 1.9.
The newest version of Docker Compose is backwards compatible with the old file format. To migrate to version 2 of the Compose file format, add a version: '2' line at the top, move all your existing service definitions under a new services: section, and optionally add top-level volumes: and networks: sections for the new features.
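The structural part of that migration can be sketched in a few lines of Python. This is only an illustration, not an official migration tool: the function name `to_compose_v2`, the sample services, and the `backend` and `dbdata` names are all made up for the example. It only wraps the old top-level services in the new layout and does not rename any per-service keys that changed between the two formats.

```python
import json


def to_compose_v2(v1_services, networks=None, volumes=None):
    """Wrap a version 1 service mapping in the version 2 layout.

    Hypothetical helper: it only restructures the document and leaves
    each service definition untouched.
    """
    doc = {"version": "2", "services": dict(v1_services)}
    if networks:
        doc["networks"] = dict(networks)
    if volumes:
        doc["volumes"] = dict(volumes)
    return doc


# Made-up version 1 content: services sit at the top level of the file.
v1 = {
    "web": {"build": ".", "ports": ["8000:8000"]},
    "db": {"image": "postgres"},
}

v2 = to_compose_v2(
    v1,
    networks={"backend": {"driver": "bridge"}},
    volumes={"dbdata": {"driver": "local"}},
)

# Printed as JSON only to show the resulting structure.
print(json.dumps(v2, indent=2))
```

The printed structure has version, services, networks and volumes as siblings at the top level, which is the shape the version 2 format expects.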
With these latest additions, people can define more complex network and storage setups in docker-compose.yml files, develop applications in these environments, and take the very same setups to their CI, testing and eventually production environments using only Docker Compose and Docker Swarm. That’s amazing!
A small but useful improvement in docker-compose.yml semantics is that you can now define both an image name and a build directory for a service. Compose tags the built image with that name, so you can save some time by pulling the image, if it exists, rather than rebuilding it every time.
Networking related goodies
More networking related features were added in this release. Check them out:
- Internal networks, which can be used to restrict traffic into and out of a network
- Containers can be assigned custom IP addresses
- Links between containers can be used in all types of networks, not only the default bridge network
- A built-in DNS server is now used by default instead of the previous solution, which relied on /etc/hosts files (yes, DNS scales better)
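To make the first two items more concrete, here is a small Python sketch that assembles the corresponding docker command lines and checks that a custom IP address actually falls inside the network's subnet. The helper names, the network name `backend`, the subnet and the image are assumptions made up for the example; the sketch only builds and prints the commands and does not talk to Docker.

```python
import ipaddress


def network_create_args(name, subnet, internal=False):
    """Build the argument list for `docker network create`."""
    args = ["docker", "network", "create", "--subnet", subnet]
    if internal:
        # Ask for an internal network with restricted outside traffic.
        args.append("--internal")
    args.append(name)
    return args


def run_args(image, network, ip):
    """Build the argument list for `docker run` with a fixed IP address."""
    return ["docker", "run", "-d", "--net", network, "--ip", ip, image]


def ip_in_subnet(ip, subnet):
    """Return True if the address lies inside the given subnet."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(subnet)


# Illustrative values, not taken from a real setup.
subnet = "172.25.0.0/16"
container_ip = "172.25.3.3"

if not ip_in_subnet(container_ip, subnet):
    raise SystemExit("the custom IP must belong to the network's subnet")

create_cmd = network_create_args("backend", subnet, internal=True)
run_cmd = run_args("nginx", "backend", container_ip)

print(" ".join(create_cmd))
print(" ".join(run_cmd))
```

A custom IP address only makes sense on a network whose subnet you defined yourself, which is why the sketch validates the address before building the commands.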
New security features in Docker 1.10
Several key security improvements were added as part of the Docker 1.10 release, with seccomp profiles (Linux syscall filtering), content addressable image IDs and user namespaces being the most significant additions.
User Namespaces in Docker
Linux namespaces (UTS, PID, NET, MNT, IPC) are one of the concepts that allow us to create what we call Linux containers. They let us give different things the same name in different contexts, much like namespaces in programming languages. The user namespace (abbreviated as USER) is the latest containment namespace added to the Linux kernel.
Before the introduction of user namespaces to Docker, the root user in a container had the same UID as root on the host system and possibly in other containers. This was obviously a security issue. With user namespaces, containers are provided with UID and GID mappings that let processes in a container believe they run, for instance, as UID 0 (commonly the root user), while they actually run as UID 1234, or 42, or some other UID on the host. This means they have root access from the perspective of the container, but not from the perspective of the host system.
Hackers will have a harder time (and possibly more fun) breaking in and out of containers again.
Currently the UID and GID mappings are specified at the Docker daemon level (using the --userns-remap flag). However, in the future we can expect efforts to allow different mappings per container, which would enable secure multi-tenancy on a single Docker host.
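The mapping itself is simple arithmetic. The following Python sketch assumes a subordinate ID range written in the name:start:count form used by /etc/subuid and /etc/subgid; the `dockremap:100000:65536` line and the helper names are illustrative values, not something read from a real host.

```python
def parse_subid_line(line):
    """Parse one 'name:start:count' entry of a subordinate ID file."""
    name, start, count = line.strip().split(":")
    return name, int(start), int(count)


def host_id(container_id, start, count):
    """Translate an ID seen inside the container to the ID on the host."""
    if not 0 <= container_id < count:
        raise ValueError("ID %d is outside the mapped range" % container_id)
    return start + container_id


# Illustrative entry; the actual range on a host may differ.
name, start, count = parse_subid_line("dockremap:100000:65536")

print(host_id(0, start, count))     # container root -> 100000 on the host
print(host_id(1000, start, count))  # container UID 1000 -> 101000 on the host
```

Under this mapping, a process that sees itself as UID 0 inside the container is an unprivileged UID 100000 from the host's point of view.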
Secure Computing Mode in Docker
Secure computing mode, or seccomp for short, is a sandboxing mechanism in the Linux kernel. It has been in the kernel since version 2.6.12, so it’s not a new concept at all.
Seccomp essentially allows users to filter the system calls available to processes. It can dramatically reduce the attack surface by enabling only those system calls that the application actually needs to function. Does your average PHP application need access to all of the more than 300 available system calls? It probably doesn't.
With Docker 1.10, users can either use the sane default seccomp profile, define their own in a JSON file and pass it to a container using the --security-opt parameter, or run containers without any seccomp profile (not recommended).
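As a rough idea of what such a JSON file looks like, the Python sketch below generates a whitelist-style profile: every system call is denied by default and only the listed ones are allowed. The layout (a defaultAction plus a syscalls list with name, action and args entries) follows the profile format as I understand it for this release, and newer Docker versions may expect a slightly different schema, so treat it as an assumption and check the documentation for your version. The syscall list is deliberately tiny and far too short to run a real application.

```python
import json


def build_profile(allowed_syscalls):
    """Build a deny-by-default seccomp profile as a plain dictionary."""
    return {
        # Any syscall not listed below fails with an error.
        "defaultAction": "SCMP_ACT_ERRNO",
        "syscalls": [
            {"name": name, "action": "SCMP_ACT_ALLOW", "args": []}
            for name in sorted(set(allowed_syscalls))
        ],
    }


# Illustrative whitelist only; a real program needs many more syscalls.
profile = build_profile(["read", "write", "exit", "exit_group", "read"])

print(json.dumps(profile, indent=2))
```

The resulting JSON would be saved to a file and handed to a container through --security-opt; the exact value syntax has varied between Docker versions, so it is worth checking the reference for the one you run.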
Content Addressable IDs
Each Docker image represents a stack of image layers. These image layers are snapshots of the image’s filesystem as it existed after the execution of each instruction from the original Dockerfile during the docker build process. These layers, whether pulled from a Docker registry or built locally, are read-only objects. Adding a read/write layer on top of them creates a container.
In previous versions of Docker, image layers (as well as containers) were identified by random UUIDs. Starting with Docker 1.10, image layers are identified by content hashes. This makes integrity verification of images after pulling, pushing, loading and saving easier and more transparent, because two different layers can no longer accidentally share the same ID, and identical content always gets the same ID.
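The difference between the two identification schemes can be illustrated with a few lines of Python. This is a simplified sketch of the idea only: it hashes arbitrary byte strings with SHA-256, whereas Docker computes its IDs over the actual layer and image data, and the function names and sample contents are made up for the example.

```python
import hashlib
import uuid


def content_id(data):
    """Derive an ID from the content itself."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def random_id():
    """Old-style ID: random and unrelated to the content."""
    return uuid.uuid4().hex


def verify(data, expected_id):
    """Check that received data matches the ID it was requested by."""
    return content_id(data) == expected_id


layer_a = b"example layer contents"
layer_b = b"example layer contents"    # same bytes, built elsewhere
layer_c = b"different layer contents"

print(content_id(layer_a) == content_id(layer_b))  # True: same content, same ID
print(content_id(layer_a) == content_id(layer_c))  # False: content differs
print(random_id() == random_id())                  # random IDs carry no such guarantee
print(verify(layer_c, content_id(layer_a)))        # False: tampering is detectable
```

With content hashes, anyone holding the data can recompute the ID and compare it, which is not possible with a randomly assigned identifier.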
Existing images created by previous versions of Docker have to be migrated. The Docker daemon automatically migrates all images present on the host after upgrading to version 1.10. This can take several minutes if there are many images to be hashed. If you can’t afford that much downtime, there’s a script for offline migration.
Note: Container IDs are still random UUIDs in Docker 1.10.
Even more updates
There are more goodies in this release. The docker update command has been added, which allows updating the resource constraints of running containers. The Docker daemon no longer has to be restarted after changes to its config file. Resource constraints can now be applied to disk I/O. The download/upload manager was refactored, enabling faster pulls and parallel pushes. And the list goes on.
Docker 1.10 is a remarkable release. I migrated all my hosts a week ago and have not experienced any issues (unlike last time, after upgrading to 1.9). I encourage everyone to consider upgrading.