Flockport labs - LXC and VXLAN

This is a followup post to the LXC networking deep dive, where we covered extending layer 2 across container hosts with Ethernet over GRE and L2TPv3. In this post we experiment with connecting containers across LXC hosts with VXLAN.

The VXLAN standard is new and evolving, so this is very much experimental territory. I was pleasantly surprised to discover that the Linux kernel supports VXLAN in not only multicast but also unicast mode.

VXLAN is a multicast-based standard, and while Cisco and others have their own unicast extensions these are one-offs, so it will be interesting to see how the Linux kernel's unicast VXLAN works.

Before we go further, it's common to connect containers and VMs across hosts that are not on the same network and are linked by public networks, using layer 3 routing and protocols like GRE, IPsec VPNs, Tinc and other technologies.

However, with layer 3 tunnels the VMs and containers are in different subnets. Layer 2 extension allows you to put them in the same logical subnet across hosts. The use case for this is more flexible VM deployment, VM mobility, graceful VM migration and failover.

VXLAN is an emerging standard for SDNs. It was created mainly by VMware, Cisco and Arista, and is designed to address the limitations of VLANs in a virtualized world. VXLAN allows you to extend layer 2 across hosts, and is designed to be more scalable than the Ethernet over GRE and L2TPv3 approaches we covered in our earlier LXC networking deep dive.

The most crisp and clear documentation on VXLAN online can be found in this 5 part blog post by Vyenkatesh Deshpande of VMware.

In brief, VXLAN operates on the concept of VTEPs and VXLAN IDs. VMs and containers are connected to logical subnets according to VXLAN IDs, and the standard's 24-bit ID space supports around 16 million different IDs. Networkheresy, Scott Lowe and ipspace are 3 great blogs to stay up to date on VXLAN, SDN and emerging networks.

Extending layer 2 basically allows you to put containers or VMs across multiple hosts in the same layer 2 network or subnet. These can be stretched across public networks and datacentres, but caveats apply: MTU issues and latency across the layer 2 extension need to be managed.
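For instance, VXLAN encapsulation adds roughly 50 bytes of overhead, so if the underlying interface has an MTU of 1500 the interfaces riding the tunnel should use around 1450 or less. A minimal sketch, using the vxlan0 interface and superbr0 bridge we create later in this guide:

ip link set dev vxlan0 mtu 1450
ip link set dev superbr0 mtu 1450

If you do this, set lxc.network.mtu in the container configs to match.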

You need a fairly recent version of the Linux kernel (3.14+) and the iproute2 package (3.14+) to try VXLAN. This is also new, so there is not a lot of information out there.

Multicast is problematic, as most cloud and service providers and organizations are reluctant to support it.

Build the VXLAN network

We are using a Debian Wheezy host and an Ubuntu Trusty host as the 2 ends of the VXLAN tunnel for this guide. You need a relatively recent kernel (3.14+) and iproute2 package (3.14+).
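You can check what you have with the commands below; the exact output format will vary by distribution.

uname -r
ip -V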

We are using 2 endpoints, but you can of course use more. The main thing is the VXLAN ID; VMs and containers connected to the same VXLAN ID will be on the same subnet. The standard allows you to create multiple VXLAN tunnels with different IDs to organize and segment your network to suit your needs.
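For example, a second, isolated segment would simply use its own ID on its own tunnel interface. A hypothetical sketch (the command syntax is explained in the next section, and vxlan1 with ID 43 is just illustrative):

ip link add vxlan1 type vxlan id 43 remote 2.2.2.2 local 1.1.1.1 dev eth0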

This is the network topology. Host 1 is on public IP 1.1.1.1 and Host 2 is on public IP 2.2.2.2, each connected to the network via eth0. Containers A & B are on Host 1 and containers C & D are on Host 2. The containers are not connected to any networks currently. We are going to put all 4 containers on the same layer 2 network, 10.0.2.0/24, across hosts and start an instance of dnsmasq to provide DHCP to this subnet.

The Linux VXLAN implementation supports both multicast and unicast. We will show you how to use both.

VXLAN Unicast

First, on Host 1, using unicast

ip link add vxlan0 type vxlan id 42 remote 2.2.2.2 local 1.1.1.1 dev eth0

We are calling our VXLAN interface vxlan0 (this can be anything you want), our VXLAN ID is 42, and we are providing the remote and local IPs and the interface they are connected over, in this case eth0.

Let's bring the interface up

ip link set up dev vxlan0

Now let's do the same on Host 2

ip link add vxlan0 type vxlan id 42 remote 1.1.1.1 local 2.2.2.2 dev eth0

Everything remains the same except the remote and local IPs. Now bring the new interface up.

ip link set up dev vxlan0

Both our VXLAN tunnel endpoints are up. Let's connect our containers to this tunnel. We need to create a bridge and add the vxlan0 interface and the containers we want to connect on this VXLAN ID to the same bridge.

On both hosts create a bridge and add the vxlan0 interface to it:

brctl addbr superbr0
ip link set up superbr0
brctl addif superbr0 vxlan0
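You can verify that vxlan0 is attached to the bridge on each host; the output lists the bridge's member interfaces.

brctl show superbr0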

With this the superbr0 bridges on both Hosts are now connected to each other via the VXLAN tunnel. Now all we need to do is configure our containers on both sides to connect to the superbr0 bridge.

Edit each container's config file and configure the network section like this. It's fairly standard; we are only pointing the container at the superbr0 bridge. Of course, ensure the MAC addresses for all 4 containers are different.

lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = superbr0
lxc.network.name = eth0
lxc.network.hwaddr = 00:16:3e:ab:f9:65
lxc.network.mtu = 1500

You can now start the containers and add static IP addresses to test layer 2 connectivity (a quick sketch of this follows below). But for this guide let's use dnsmasq on Host 1 to serve the superbr0 interface.
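For the static test, assuming a container named containera on Host 1 (the name and address are purely illustrative), you would do something like the below, and the same on Host 2 with a different address such as 10.0.2.20. The two containers should then be able to ping each other across the tunnel.

lxc-start -n containera -d
lxc-attach -n containera -- ip addr add 10.0.2.10/24 dev eth0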

Dnsmasq needs to run on Host 1 only. In case you don't have dnsmasq, install it and edit /etc/dnsmasq.conf with the values below. Let's use the subnet 10.0.2.0/24 for this exercise.

On Host 1 only

interface=superbr0
listen-address=10.0.2.1
bind-interfaces
dhcp-range=10.0.2.2,10.0.2.254,12h

Dnsmasq will not work without an IP on the interface, so let's add an IP to the superbr0 bridge on Host 1.

ip addr add 10.0.2.1/24 dev superbr0

After making the changes, restart dnsmasq

service dnsmasq restart

Dnsmasq is now listening on the superbr0 bridge on Host 1. Containers connecting to the superbr0 bridge on Host 1 will get an IP in the 10.0.2.0/24 range from dnsmasq. The superbr0 bridges on both hosts are connected by the VXLAN tunnel.

If we have configured VXLAN correctly, containers connecting to the superbr0 bridge on Host 2 will ALSO get an IP from the dnsmasq instance on Host 1! Nice.

Now start the 4 containers on both hosts and they should be able to ping each other.
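For example, if container A on Host 1 is named containera and container C on Host 2 picked up 10.0.2.3 from dnsmasq (the names and leases here are just illustrative):

lxc-attach -n containera -- ping -c 3 10.0.2.3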

You can get more information on the vxlan0 interface with the ip link utility

ip -d link show vxlan0

Get information on the VXLAN forwarding tables

bridge fdb show dev vxlan0

You can also configure a destination UDP port on both sides using the dstport flag, but ensure you are running iproute2 3.16+ on both sides. The IANA standard port for VXLAN is 4789; the Linux kernel uses 8472 by default.

ip link add vxlan0 type vxlan id 42 remote 1.1.1.1 local 2.2.2.2 dev eth0 dstport 4789

We have not configured any routing for the 10.0.2.0/24 subnet. You can easily set up a masquerade rule for the 10.0.2.0/24 subnet on superbr0 on Host 1 so all containers can access the Internet; remember all this traffic will be routed out through Host 1.

iptables -t nat -A POSTROUTING -s 10.0.2.0/24 ! -d 10.0.2.0/24 -j MASQUERADE
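For the masquerade rule to actually route traffic out, IP forwarding also needs to be enabled on Host 1, if it isn't already.

sysctl -w net.ipv4.ip_forward=1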

Extending this beyond 2 hosts with unicast doesn't quite work yet. Sure, you can create a VXLAN tunnel from Host 3 to Host 1, and containers on both sides will be able to ping each other, but containers on Host 2 can't reach Host 3 and vice versa. A lot of the tools and pieces are yet to fall into place. To scale beyond 2 hosts you need some sort of discovery mechanism for VTEPs. See the details of Cisco's implementation here.

Cumulus Linux, along with Metacloud (recently acquired by Cisco), has been working on VXLAN unicast and recently released the VXFLD project with some tools that may help address this.

There is a proposed patch for the kernel vxlan driver to support multiple unicast destinations to solve this, though I am not sure whether the patch has made it in, or how one would use it.
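One approach that newer kernels and iproute2 appear to support, which I haven't verified end to end, is to create the VXLAN interface without a remote and append an all-zeros FDB entry for each remote VTEP, so broadcast and unknown traffic gets replicated to all of them. A rough sketch on Host 1, with 3.3.3.3 standing in for a hypothetical third host; every host would need the equivalent set of entries:

ip link add vxlan0 type vxlan id 42 local 1.1.1.1 dev eth0
bridge fdb append 00:00:00:00:00:00 dev vxlan0 dst 2.2.2.2
bridge fdb append 00:00:00:00:00:00 dev vxlan0 dst 3.3.3.3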

Until VXLAN becomes more robust, Ethernet over GRE or L2TPv3 remain excellent options for extending layer 2 across hosts.

VXLAN Multicast

VXLAN is actually designed with multicast in mind. The thing is, most organizations are reluctant to enable multicast and the large majority of networks don't support it. Of course, in your own environment it's up to you.

This is how you would use VXLAN multicast. Only the VXLAN tunnel creation step differs; everything else remains the same.

ip link add vxlan0 type vxlan id 42 group 239.1.1.1 dev eth0

Notice the 'group 239.1.1.1'. This is the multicast IP. The multicast address is used for discovery and for populating the VXLAN VTEP tables. See this excellent link on how it works.

Multicast works beyond 2 hosts, as it's designed to. We used the Linux bridge for this exercise; you can also use OVS (Open vSwitch) to create VXLAN tunnels.
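For reference, a minimal OVS sketch of the unicast tunnel from Host 1, assuming the openvswitch-switch package is installed (the ovsbr0 and vx0 names are just illustrative):

ovs-vsctl add-br ovsbr0
ovs-vsctl add-port ovsbr0 vx0 -- set interface vx0 type=vxlan options:remote_ip=2.2.2.2 options:key=42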
