Understanding the key differences between LXC and Docker

This is a two-part series exploring Linux containers, container managers like LXC and Docker, and the potential of containers as lightweight alternatives to virtualization.

Linux containers have the potential to transform how we run and scale applications. Container technology is not new. Mainstream support in the vanilla kernel, however, is new, and it paves the way for widespread adoption.

FreeBSD has Jails, Solaris has Zones, and there are other Linux container technologies like OpenVZ and Linux VServer, but these depend on custom kernels, which has impeded adoption.

There is an excellent presentation on the history and current state of Linux containers by Dr. Rami Rosen which provides fantastic perspective and context.

What's the fuss?
Containers isolate and encapsulate your application workloads from the host system. Think of a container as an OS within your host OS in which you can install and run applications, and which for all practical purposes behaves like a virtual machine.

This emulation is enabled by the Linux kernel and the LXC project which provides minimal container OS templates for various distributions and userland applications for container life cycle management.

Portability
Containers decouple your applications from the host OS, abstracting them and making them portable across any system that supports LXC. To call this useful would be a wild understatement. Users can have a clean and minimal base Linux OS and run everything else, let's say a LAMP stack, in a container.

Because apps and workloads are isolated, users can run multiple versions of PHP, Python, Ruby or Apache happily coexisting, tucked away in their own containers. They also get the cloud-like flexibility of instances and workloads that can be easily moved across systems, cloned, backed up and deployed rapidly.

But doesn't virtualization do this?
Yes, but at a performance penalty and without the same flexibility. Containers do not emulate a hardware layer. Instead they use cgroups and namespaces in the Linux kernel to create lightweight virtualized OS environments with near bare-metal speeds. Since you are not virtualizing storage, a container doesn't care about the underlying storage or file systems and simply operates wherever you put it.
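To make the namespaces mentioned above a little more concrete, here is a minimal Python sketch that lists the namespaces the current process belongs to by reading the /proc/self/ns directory that the Linux kernel exposes. The function name is ours and purely illustrative. On a system without /proc it simply returns an empty dictionary.

```python
import os


def current_namespaces(pid="self", proc_root="/proc"):
    """Return {namespace_name: kernel_identifier} for a process.

    Illustrative helper, not part of LXC or Docker. On Linux each entry in
    /proc/<pid>/ns is a symlink such as 'net -> net:[4026531956]'.
    """
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    namespaces = {}
    try:
        names = sorted(os.listdir(ns_dir))
    except OSError:
        # No /proc (non-Linux host) or no permission: nothing to report.
        return namespaces
    for name in names:
        try:
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Skip entries that cannot be read.
            continue
    return namespaces


if __name__ == "__main__":
    for name, ident in current_namespaces().items():
        print(f"{name:10s} {ident}")
```

Processes that share a namespace show the same identifier for it. A process inside a container shows different identifiers from the host for the namespaces that container has been given.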

This fundamentally changes the way we virtualize workloads and applications, as containers are simply faster, more portable and can scale more efficiently than hardware virtualization, with the exception of workloads that require an OS other than Linux or a specific Linux kernel version.

Is it game over for VMware then?
Not so fast! Virtualization is mature, with extensive tooling and ecosystems to support its deployment across various environments. And for workloads that require a non-Linux OS or a specific kernel, virtualization remains the only way.

LXC
LXC owes its origin to the development of cgroups and namespaces in the Linux kernel to support lightweight virtualized OS environments (containers), and to some early work by Daniel Lezcano and Serge Hallyn at IBM dating from 2009.

The LXC project provides tools to manage containers, advanced networking and storage support, and a wide choice of minimal container OS templates. It is currently led by a two-member team, Stéphane Graber and Serge Hallyn from Ubuntu, and is supported by Ubuntu.

LXC is actively developed but not well documented beyond Ubuntu. Cross-distribution documentation is lacking and things usually work well in Ubuntu first, leading to all-round frustration and hair pulling for users of other distributions.

There is a lot of confusing, outdated and often simply misleading information online. Add Docker to the mix, which has aggressively marketed itself to the wider community (Ubuntu, why so quiet?), and both the volume of information and the scope for confusion have widened.

To clear up the misconceptions: both LXC and Docker are userland container managers that use kernel namespaces to provide end user containers. We also now have systemd-nspawn, which does the same thing. The only difference is that LXC containers have an init and can thus run multiple processes, while Docker containers do not have an init and can only run a single process.

LXC maintainer Stephane Graber's excellent 10 part Blog series on LXC 1.0 and our LXC Getting started guide provide an overview of LXC and its capabilities.

How they differ
The idea behind Docker is to reduce a container as much as possible to a single process and then manage that through Docker. The main problem with this approach is that you can't wish the OS away, as the vast majority of apps and tools expect a multi-process environment and support for things like cron, logging, ssh and daemons. Since Docker gives you none of this, you have to do everything via Docker, from basic app configuration to deployment, networking, storage and orchestration.

LXC sidesteps that with a normal OS environment. It is thus immediately and cleanly compatible with all the apps and tools and with any management and orchestration layers, and can be a drop-in replacement for VMs.

Beyond that Docker uses layers and disables storage persistence. LXC supports layers with aufs, overlayfs and has wide support for COW cloning and snapshots with btrfs, ZFS, LVM Thin and leaves it to user choice. Separating storage in LXC containers is a simple bind mount to the host or another container for users who choose to do so.
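As a rough sketch of the bind mount mentioned above, the following Python helper renders the kind of lxc.mount.entry line that goes into a container's config file. The helper name and the example paths are hypothetical. The line follows the fstab-style format LXC uses, where the target is given relative to the container's root filesystem and is assumed to already exist inside the container.

```python
def bind_mount_entry(host_path, container_path):
    """Render an lxc.mount.entry line that bind mounts a host directory.

    host_path must be absolute. container_path is written relative to the
    container's root filesystem, so any leading slash is stripped.
    """
    if not host_path.startswith("/"):
        raise ValueError("host_path must be an absolute path")
    target = container_path.lstrip("/")
    return f"lxc.mount.entry = {host_path} {target} none bind 0 0"


if __name__ == "__main__":
    # Hypothetical paths: keep the database files on the host.
    print(bind_mount_entry("/srv/data", "/var/lib/mysql"))
```

The data then lives on the host (or in another container's filesystem) while the container itself can be cloned or replaced independently.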

Both Docker and LXC set up a default NAT network. Docker additionally sets up port forwarding to the host with the -p flag, for instance '-p 80:80' forwards port 80 from the host to the container. With NAT, containers can be accessed directly from the local host by their IPs and only need port forwarding when consumed by external services, which can be done with a simple iptables rule when required, so the reason for doing this is not very clear.
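For reference, the iptables rule in question is a single DNAT entry in the nat table. The Python sketch below only builds and prints the command rather than running it, since applying it needs root. The function name and the container IP are illustrative assumptions.

```python
def dnat_rule(host_port, container_ip, container_port, protocol="tcp"):
    """Build the argument list for an iptables DNAT rule.

    The rule rewrites traffic arriving on host_port so that it is delivered
    to container_ip:container_port.
    """
    for port in (host_port, container_port):
        if not 1 <= int(port) <= 65535:
            raise ValueError(f"invalid port: {port}")
    return [
        "iptables", "-t", "nat", "-A", "PREROUTING",
        "-p", protocol, "--dport", str(host_port),
        "-j", "DNAT", "--to-destination", f"{container_ip}:{container_port}",
    ]


if __name__ == "__main__":
    # 10.0.3.10 is a made-up container address on the NAT bridge.
    print(" ".join(dnat_rule(80, "10.0.3.10", 80)))
```

Because the rule sits in the PREROUTING chain it applies to traffic arriving from outside the host, which is the case described above. Local access by container IP needs no rule at all.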

To compound matters, Docker gives you very little control of the IP and hosts file, and you can't set static IPs for containers, which makes assigning services to IPs a bit of a conundrum. You need to use the '--link' flag to connect containers, which adds an entry to the /etc/hosts file of the linked container. The need to abstract away basic networking in this way seems a bit pointless and adds a needless layer of complexity.

With LXC it's much simpler to assign static IPs and routable IPs, use multiple network devices, manage the /etc/hosts file and basically use the full stack of Linux network capabilities with no limitations. Want to connect containers across hosts? Users can set up quick overlays using GRE, L2TPv3 or VXLAN tunnels, or any networking technology they are using currently. Whatever works for VMs will work seamlessly for LXC containers.
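As an illustration of the static IP case, the sketch below renders the network section of an LXC container config. It assumes the LXC 1.0 key names (lxc.network.*), which later LXC releases renamed to lxc.net.0.*. The lxcbr0 bridge is the Ubuntu default, and the addresses and function name are made up for the example.

```python
import ipaddress


def static_network_config(address, gateway, bridge="lxcbr0"):
    """Render LXC 1.0 style config lines for a veth device with a static IP.

    address is in CIDR form, e.g. '10.0.3.50/24'. The gateway must lie in
    the same network, otherwise the container could not reach it.
    """
    iface = ipaddress.ip_interface(address)
    if ipaddress.ip_address(gateway) not in iface.network:
        raise ValueError(f"gateway {gateway} is not in {iface.network}")
    lines = [
        "lxc.network.type = veth",
        f"lxc.network.link = {bridge}",
        "lxc.network.flags = up",
        f"lxc.network.ipv4 = {address}",
        f"lxc.network.ipv4.gateway = {gateway}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(static_network_config("10.0.3.50/24", "10.0.3.1"))
```

Repeating the block with a second device definition is how a container gets multiple network interfaces.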

Docker
Docker is a project by dotCloud, now Docker Inc, released in March 2013 and initially based on the LXC project to build single application containers. Docker has since developed its own implementation, libcontainer, which uses kernel namespaces and cgroups directly.

Layered containers
Docker was initially based on LXC's support for Aufs to build layered containers, and is now adding support for Btrfs, device mapper and Overlayfs, as Aufs may not be merged into the kernel.

A Docker container is made up of a base image plus layers which, when committed, become Docker images. When you run an image a copy is launched as a container, and any data is transient until 'committed'. Every commit is a separate image, so you can start off from any of them.

We have a guide on using Overlayfs with LXC which should give users an idea of how layers work. With union filesystems like Aufs or Overlayfs (they differ in their implementation, performance and support for the number of lower layers) the lower layers are read-only and the upper layer is read-write at run time. In the context of containers the lower layer is usually, but not necessarily, the base OS and the upper layers are the changes you make.
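The read-only lower layers and writable upper layer can be modelled in a few lines of Python. This is only a toy model of the lookup rule, using dictionaries in place of directories and made-up file contents. It is not how Aufs or Overlayfs are implemented, and it leaves out deletions, which real union filesystems handle with whiteout entries.

```python
from collections import ChainMap

# Lower layers: treated as read-only once a container is started from them.
base_os = {"/etc/hostname": "base", "/usr/bin/php": "php binary"}
app_layer = {"/var/www/index.php": "<?php echo 'hello';"}


def start_container(*lower_layers):
    """Return a merged view with a fresh, empty upper layer on top.

    ChainMap searches its mappings in order for reads and sends every
    write to the first mapping only, which mirrors the union mount rule.
    """
    upper = {}
    return ChainMap(upper, *lower_layers)


if __name__ == "__main__":
    container = start_container(app_layer, base_os)
    container["/etc/hostname"] = "web1"  # lands in the upper layer only
    print(container["/etc/hostname"])    # the upper layer shadows the base
    print(base_os["/etc/hostname"])      # the base layer is unchanged
    print(container.maps[0])             # the changes a commit would capture
```

Starting a second container from the same lower layers gives it its own empty upper layer, which is why many containers can share one base image.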

While the idea of layers sounds good, layered file systems are still immature technology, and there is an inherent complexity and performance penalty in using layers, as well as a real risk of getting bogged down in them.

Single application containers
Docker restricts the container to a single process only. The default docker baseimage OS template is not designed to support multiple applications, processes or services like init, cron, syslog, ssh etc.

As we saw earlier this introduces a certain amount of complexity for day to day usage scenarios. Since current architectures, applications and services are designed to operate in normal multi process OS environments you would need to find a Docker way to do things or use tools that support Docker.

Take a simple application like WordPress. You would need to build 3 containers that consume services from each other: a PHP container, an Nginx container and a MySQL container, plus 2 separate containers for persistent data for the MySQL DB and the WordPress files. Then you configure the WordPress files to be available to both the PHP-FPM and Nginx containers with the right permissions and, to make things more exciting, figure out a way to make these talk to each other over the local network, without proper control of networking and with IPs randomly assigned by the Docker daemon! And we have not yet figured out cron and the email that WordPress needs for account management. Phew!

This is a can of worms and a recipe for brittleness. It is a lot of work that you would not even have to think about with OS containers. It adds an unbelievable amount of complexity and fragility to basic deployment, and hacks, workarounds and entire layers are now being developed to manage this complexity. This cannot be the most efficient way to use containers.

Can you build all 3 in one container? You can, but then why not simply use LXC, which is designed for multiple processes and is simpler to use? To run multiple processes in Docker you need a shell script or a separate process manager like runit or supervisor. But this is considered an 'anti-pattern' by the Docker ecosystem, and the whole architecture of Docker is built around single process containers.
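To give an idea of what such a launcher does, here is a deliberately minimal Python sketch that starts several commands and waits for all of them. The service commands are stand-ins we made up. A real process manager like runit or supervisor also restarts failed services, forwards signals and reaps orphaned processes, none of which this sketch attempts.

```python
import subprocess
import sys


def run_all(commands):
    """Start every command, wait for all of them, return their exit codes."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    # Stand-ins for real services such as a web server and cron.
    services = [
        [sys.executable, "-c", "print('web server stand-in')"],
        [sys.executable, "-c", "print('cron stand-in')"],
    ]
    print(run_all(services))
```

In an OS container none of this is needed, because the container's own init system already starts and supervises services.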

Persistent data
Docker separates container storage from the application. You mount persistent data with bind mounts to the host (data volumes) or bind mounts to containers (data volume containers).

This is one of the most baffling decisions. By bind mounting data to the host you are eliminating one of the biggest features of containers for end users: easy mobility of containers across hosts. Probably as a concession, Docker gives you data volume containers, where the data is bind mounted to a normal container and is portable, but this is yet another layer of complexity and reflects just how much Docker is driven by the PaaS provider use case of app instances.

Restricting storage in this way provides little benefit and creates a needless problem for the end user to solve: how to manage their persistent app and user data. Installing and configuring even a simple database like MySQL is non-trivial. You either need another set of companies to come in and solve this problem or have to get your hands dirty. Complexity is often equal to brittleness. Unless your use case is only containers with non-persistent data, this has the potential to make Docker containers less portable.

Registry
Docker provides public and private registries that users can push images to and pull images from. This is similar to the Flockport app store, which provides ready to use containers. This makes it easy for users to share and distribute applications.

Dockerfile
A Dockerfile is a script that tells Docker how to build a container from an image with a particular app. It's similar to using a bash script to create an LXC container with particular apps installed.

Widening gap with LXC
LXC features need to be reimplemented by the Docker team to be available in Docker. For instance, LXC now supports unprivileged containers, which let non-root users create and deploy containers, and is working on live migration and multi-host management. These are big steps forward for containers and pave the way for better security, multi-tenant workloads and virtualization parity.

Docker does not support these yet. And with the recent announcement of libcontainer, the capabilities of the 2 will presumably keep growing apart.

There is no right or wrong way to run containers. It's up to users. The Docker approach is unique and will necessitate custom approaches at every stage to find the Docker way to accomplish tasks, from installing and running applications to scaling containers.

To take just one example, as you scale you run agents or use SSH to monitor health, run checks, respond to events, orchestrate actions, collect logs and update configuration on the fly, e.g. for security updates. You can't run agents, keep logs or even SSH into Docker containers, because each of these would become a second application in the container.

Configuration changes to system files like /etc/hosts are involved because there is no concept of an OS with app containers, and application configuration changes need to be written to a new layer and committed.

If you are trying to build single app containers for a PaaS-centric scenario, as Docker is designed for, the tradeoff in complexity may make sense. For other use cases these limitations can be a significant challenge to manage and begin to make less sense.

The way ahead
Virtualization enabled the cloud by allowing the OS and apps to be frozen in state, transforming them into instances that are decoupled from the hardware and can be easily moved around. OS containers add even more speed, flexibility and portability, expanding the possibilities.

Docker has done an excellent job of wrapping containers and overlay file systems into a developer friendly model with the Dockerfile and commits. It's only when you go beyond a single laptop that the big problems of scale like management, monitoring, storage and networking begin to make this model extremely complex and brittle.

OS containers, on the other hand, are similar to VMs and have OS level capabilities, which makes them much easier to integrate into normal and distributed systems with the current tooling, without the need for any separate tooling to be developed.

Docker is a VC backed company and could market itself aggressively. The result is a lot of users have heard of containers in the context of Docker and do not know about OS containers and that they are simpler to use. We see users struggling with single app containers, layers and the lack of storage persistence when they are simply looking to run a container as a lightweight VM.

Many large scale users and enterprises will simply not consider anything that requires extra engineering effort to transition their workloads from VMs and is incompatible with the rest of their infrastructure in terms of networking, storage and management.

This is where LXC comes into its own. It's not opinionated, and it offers all the compelling advantages of containers and a seamless transition from VMs without the need to re-architect how you deploy your apps, which is an incredible value proposition.

Here is a sneak peek of LXC

Part II - Sleepwalking into a monoculture and Lock-in with Linux Containers

All efforts have been made to ensure the article is accurate. Please let us know in the comments section, or contact us if you feel there are any inaccuracies.
