Understanding Layers and Containers

Many people have come to associate layers with containers, but they are separate technologies. There is some confusion about how layers work and how some container implementations use them, so let's use this opportunity to get a clearer perspective.

Aufs and Overlayfs

Layers were first popularized by the open source Aufs and Overlayfs projects. Aufs was developed by Junjiro Okajima in 2006. Overlayfs was developed mainly by Miklos Szeredi around 2011.

The LXC project supported Aufs to run layered containers in 2012. At that time neither Aufs nor Overlayfs was in the mainline kernel. LXC was an Ubuntu-supported project, and the Ubuntu kernel at that time had support for Aufs out of the box. On any other kernel you had to recompile to add support for Aufs.

While there was talk of an imminent merge of Aufs into the kernel, it didn't happen. Aufs was eventually found to be too complex, and the Overlayfs project was merged instead in kernel 3.18.

Docker, which was based on the LXC project, used the already supported Aufs to build containers. Layering basically places one filesystem on top of another. The lower filesystem is read only and all writes go to the upper filesystem.
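To make the read and write paths concrete, here is a minimal sketch of the idea in Python. It is a toy model that uses dictionaries in place of real filesystems, not how Aufs or Overlayfs are implemented, and the class name LayeredView and the file paths are made up for illustration.

```python
from types import MappingProxyType


class LayeredView:
    """Toy model of a two-layer union mount (hypothetical, for illustration)."""

    def __init__(self, lower):
        # Read-only view of the lower layer: it cannot be written through.
        self.lower = MappingProxyType(lower)
        # Every write lands in the upper layer.
        self.upper = {}

    def read(self, path):
        # The upper layer wins; otherwise fall back to the lower layer.
        if path in self.upper:
            return self.upper[path]
        return self.lower[path]

    def write(self, path, data):
        self.upper[path] = data

    def listing(self):
        # The consolidated view is the union of both layers.
        return sorted(set(self.lower) | set(self.upper))


base = {"/etc/hostname": "base", "/var/www/index.php": "<?php ..."}
view = LayeredView(base)
view.write("/etc/hostname", "web1")          # shadows the lower copy
view.write("/var/www/uploads/a.txt", "data")  # new file, upper layer only

print(view.read("/etc/hostname"))
print(view.listing())
print(base["/etc/hostname"])
```

The last line shows the point of the arrangement: the base data still holds its original value after the writes.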

Containers and Layers

So if you have a container with an app, say a WordPress container, and you decide to launch it with Aufs or Overlayfs, the WordPress container becomes the read-only lower filesystem and any runtime data goes to the new upper layer. The original WordPress container remains untouched.

At runtime, because of Aufs or Overlayfs, the launched container sees a consolidated view of the upper and lower filesystems, and as far as the system is concerned it's a single filesystem. You can achieve the exact same effect by using a copy of the container. The important thing to understand is that layers only give you disk space savings.

Aufs always supported a large number of layers, so you could keep building layers upon layers. For instance, you could launch an Ubuntu OS container and add PHP in a layer, then stop the container, use the result to start the container in another layer, add nginx in the new layer, and so on. All the lower layers remain read only while the top layer is writable.
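The Ubuntu, PHP and nginx stack described above can be sketched with Python's collections.ChainMap, which looks keys up from the front mapping to the back and sends every write to the front mapping only. This is an analogy under that assumption, not a filesystem, and the paths and values are placeholders.

```python
from collections import ChainMap

# Base OS layer (placeholder contents).
ubuntu = ChainMap({"/bin/sh": "shell"})

# Start from the base in a new layer and add PHP there.
php = ubuntu.new_child()
php["/usr/bin/php"] = "php"

# Start from the result in another new layer and add nginx there.
nginx = php.new_child()
nginx["/usr/sbin/nginx"] = "nginx"

# A change to an existing file is recorded in the top layer only.
nginx["/bin/sh"] = "patched shell"

print(sorted(nginx))      # consolidated view across all three layers
print(len(nginx.maps))    # number of stacked layers
print(ubuntu["/bin/sh"])  # the base layer is unchanged
```

Each new_child() call adds one more mapping to search, which is also a fair picture of why deep stacks get harder to reason about.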

Layers of Complexity

The problem with layers is that things get complex pretty quickly, with hard-to-detect bugs, filesystem incompatibilities and permission problems. For instance, if one of the layers makes file permission changes to a folder in /etc, this may not always be reflected in the upper layers, and that creates bugs that are extremely difficult to find.

While layers are interesting, most real world users will find that using layers adds a lot more complexity than they bargained for in basic container management. File systems are fundamental and you simply cannot afford bugs or flakiness at this level. The more layers, the worse it becomes.

Overselling Layering Benefits

There is a difference between using layers at runtime and building images with layers. If you are just using layers at runtime, you can keep a library of base images with apps and run these images with Aufs or Overlayfs, so the base images stay intact and data goes to the new layer. The advantage is that this limits layering to a single layer.

But using layers to build containers becomes questionable because the benefits are marginal. The claimed payoff for building images out of layers, piling them together and carrying the overhead of managing them, is composability, reuse and some kind of immutability.

Composability & Reuse

Any assumed benefits around composability may not pan out, because layers are heavily dependent on each other. Any update will require a remount or a rebuild. Similarly, if there are security updates or OS updates to base OS images or underlying layers, any containers built on them will need to be rebuilt. The use of layers here delivers no advantages but adds a lot of management overhead.
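The dependency problem can be pictured as a simple parent chain. The sketch below assumes each image records the single image it was built on; the image names and the needs_rebuild helper are hypothetical and do not come from any particular container tool.

```python
def needs_rebuild(parents, updated):
    """Return every image that sits, directly or indirectly, on top of `updated`.

    `parents` maps an image name to the image it was built on (None for a base).
    Assumes the parent chains contain no cycles.
    """
    stale = set()
    for image in parents:
        node = image
        # Walk down the chain of lower layers until we hit a base image.
        while parents.get(node) is not None:
            node = parents[node]
            if node == updated:
                stale.add(image)
                break
    return sorted(stale)


# Hypothetical image library.
parents = {
    "ubuntu": None,
    "php": "ubuntu",
    "nginx-php": "php",
    "wordpress": "nginx-php",
    "alpine": None,
    "redis": "alpine",
}

# A security update to the base OS image touches everything above it.
print(needs_rebuild(parents, "ubuntu"))
```

With this sample data, an update to the ubuntu image marks php, nginx-php and wordpress for a rebuild, while the alpine chain is unaffected.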

You are essentially pulling app images or layers to build a static container. You can do the exact same thing with base OS, dev and app images to build apps, without using layers or taking on the complexity of managing multiple layers.

Reuse can be equally achieved with base OS and app images on standard file systems without using layers. A PHP, Ruby, Go container or any other container can be reused as a base for your applications. No layers required.

And layers do come with downsides. Not being able to build containers as a non-root user is a direct result of using layers. In Linux, mounting a filesystem normally requires root, so if you use layers to build containers you will need root permissions. On Flockport, for instance, you can build on unprivileged containers.
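To show where the mount comes in, here is a small sketch that assembles an Overlayfs mount command. The lowerdir, upperdir and workdir options are the standard Overlayfs mount options; the helper name and all of the directory paths are hypothetical. The command is only printed, never executed.

```python
import os
import shlex


def overlay_mount_command(lower_dirs, upper_dir, work_dir, merged_dir):
    """Build the argument list for an Overlayfs mount (illustrative helper)."""
    # Multiple lower directories are separated by colons.
    options = "lowerdir={},upperdir={},workdir={}".format(
        ":".join(lower_dirs), upper_dir, work_dir
    )
    return ["mount", "-t", "overlay", "overlay", "-o", options, merged_dir]


# Hypothetical paths: a read-only WordPress image plus a per-container upper layer.
cmd = overlay_mount_command(
    ["/var/lib/images/wordpress"],
    "/var/lib/run/wp1/upper",
    "/var/lib/run/wp1/work",
    "/var/lib/run/wp1/merged",
)
print(shlex.join(cmd))

# Report whether the current user is root; the mount itself is not attempted.
if os.geteuid() != 0:
    print("not running as root: this mount would normally be refused")
```

The container then runs from the merged directory, with the image as the lower directory and runtime data landing in the upper directory.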

Immutability

Words like immutability are thrown around but fall apart on scrutiny. Containers are mutable at run time. 'Immutability' used in this context is an artifact of the base image in the same way that running a copy of a container leaves the original untouched. A container is simply a folder on your filesystem. If you use a clone of that folder the original folder is 'immutable'. Using layers to suggest some special 'immutability' is completely meaningless.
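The clone argument is easy to try out. The sketch below builds a stand-in container folder in a temporary directory, copies it with shutil.copytree, writes to the copy and checks the original. The folder names and file contents are invented for the example.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # A stand-in for a container: just a folder with some files in it.
    original = Path(tmp) / "wordpress"
    (original / "etc").mkdir(parents=True)
    (original / "etc" / "hostname").write_text("base\n")

    # Run from a clone instead of the original folder.
    clone = Path(tmp) / "wordpress-run1"
    shutil.copytree(original, clone, symlinks=True)

    # Runtime changes land in the clone only.
    (clone / "etc" / "hostname").write_text("web1\n")
    (clone / "data.txt").write_text("runtime data\n")

    original_hostname = (original / "etc" / "hostname").read_text()
    clone_hostname = (clone / "etc" / "hostname").read_text()
    original_has_data = (original / "data.txt").exists()

print(repr(original_hostname), repr(clone_hostname), original_has_data)
```

The original folder is untouched after the run, which is all the 'immutability' amounts to. The cost compared with a layer is the disk space taken by the copy.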

Beyond all the marketing, the one important thing the use of layers gives you is disk space savings. Disk space savings are not a good reason to adopt complexity, and with disk space being cheap there is not much incentive. The bigger issue is that things like deduplication and snapshots are best handled at the filesystem level, transparently and without management overhead.

Layers are single host features and are not compatible with any distributed filesystem. Layers should be used judiciously as a choice, something to experiment with, not a debt that is imposed on everyone without understanding the tradeoffs involved.

