Overview

Distributed storage lets us share storage across servers and keeps your data in sync across them.

In the simplest case NFS allows us to share folders on a server with other servers on the network.

With more advanced solutions we create storage pools that span servers, keep them in sync with each other, and share these pools across the network.

Flockport lets you build distributed storage with NFS, Gluster and MFS.

NFS

NFS is a widely used and proven solution. It's also the simplest to use. Conceptually you export a folder on a server and then share the exported folder with other servers.
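
For reference, under the hood an NFS export and mount look roughly like this. The paths, subnet and mount options here are only illustrative; Flockport automates the equivalent of these steps for you.

On the exporting server, an entry in /etc/exports:

/var/www/sites 192.168.122.0/24(rw,sync,no_subtree_check)

Then re-export on the server and mount on a client:

exportfs -ra
mount -t nfs s1:/var/www/sites /var/www/sites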

For this example we are going to use 3 servers and share a folder. Our servers are s1, s2 and s3, and s1 will share a folder with s2 and s3.

To create a share simply use the addshare command.

flockport addshare web s1 /var/www/sites

This will share /var/www/sites on server s1. 'web' is the name of the share. You can choose your own name.

We can now share web with other servers on the network.

flockport share web s2 s3

This shares the web share with servers s2 and s3. You can add some files to /var/www/sites on any of the servers and see them replicated to the others.
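
A quick way to verify the share is to create a file on one server and list it on the others, assuming you have SSH access to them. The file name here is just an example.

ssh s1 touch /var/www/sites/hello.txt
ssh s2 ls /var/www/sites
ssh s3 ls /var/www/sites

hello.txt should show up on s2 and s3.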

We weren't joking when we said let's make things simple. Flockport really does make things simple.

Gluster

Gluster uses the concept of bricks, volumes and replication strategies.

Bricks are the basic units that exist on servers; they combine to form a volume which can then be shared.
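
For context, the native gluster CLI expresses this roughly as follows, with bricks on two servers combined into a replicated volume. The brick paths and volume name here are illustrative; Flockport automates the equivalent of these steps.

gluster peer probe s2
gluster volume create gvol replica 2 s1:/mnt/gluster/brick s2:/mnt/gluster/brick
gluster volume start gvol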

We don't like flakiness. With Gluster there are some inherent issues with incompatibility between versions. Add to this the reality of different distributions shipping different supported versions, and reliability and robustness across distributions become difficult to achieve.

We have tried our best to simplify and automate creating Gluster shares but some rough edges remain.

Gluster's performance on small files is not great but it comes into its own with large media files spread across a large number of servers.

For distributed storage Flockport uses the concept of pools. You create storage pools across servers and then share them. The servers must be in the same subnet and have direct network connectivity to each other.

A Gluster storage pool must be created with servers in multiples of 2. Before creating a Gluster pool you need to link the hosts that make up the pool, as the servers must be able to reach each other by their hostnames.

We are going to use an example with 3 servers. The servers will be s1, s2 and s3. s1 and s2 will form the Gluster storage pool and the pool will be shared with s3.

First let's link s1 and s2 with the linkhosts command.

flockport linkhosts s1 s2

This will add entries in s1 and s2's /etc/hosts for each other.

When linking hosts you can specify the server IPs if you want the storage network to operate over a specific network.

flockport linkhosts s1:192.168.122.10 s2:192.168.122.11

This will add entries in s1 and s2's /etc/hosts with the specified IPs.
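
The entries added are plain hostname mappings. On s1 the result would look something like this, with the matching entry for s1 added on s2.

192.168.122.11   s2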

You can create a storage pool with the createpool command.

flockport createpool gvol gluster

This creates a new gvol storage pool with a default gluster configuration. You can use your own name.

You can now add peers to the storage pool with the addpool command and keyword peers. Peers are the servers hosting the storage pool.

flockport addpool gvol peers s1 s2

This will add servers s1 and s2 to the gvol gluster storage pool.

This process may take a few minutes as Gluster is installed and the storage pool is created and initiated.
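
If you want to inspect the pool directly on one of the peers, the standard Gluster tools work as usual. For example, on s1:

gluster peer status
gluster volume info

These are Gluster's own commands, not Flockport's, and should show the peer relationship and the volume backing the pool.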

Once the storage pool is up you can add clients to it with the addpool command and keyword client.

flockport addpool gvol client s3

This shares the gvol storage pool with server s3. Now any files and folders created in the shared folder on s3 will be shared with other clients in the pool and replicated by s1 and s2.

A gluster storage pool is created with a few defaults. The default pool size is 2GB. The default volume is /mnt/gluster. The shared folder is /var/data. The default replication is set to 2 replicas.

You can change these defaults when creating the storage pool.

flockport createpool gvol gluster -v volume -d shared -a allow -s size

The -v flag specifies the name of the gluster volume. The -d flag specifies the shared folder which will be shared with clients in the pool. The -a flag specifies the IP ranges that are allowed access to the pool. The -s flag specifies the pool size.

If you set access controls with an allowed IP range then you must ensure clients are within the specified range or the share will fail.

You can get pool details with the listpools command.

flockport listpools

This will list details on all storage pools across Flockport managed servers.

You can add more clients with the addpool command.

flockport addpool gvol client s4 s5

This will share the gvol pool with servers s4 and s5.

The /var/data folder will be shared with clients. Any changes made to /var/data will be reflected in all gvol clients and be replicated by the pool servers.
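
As with NFS, a quick check is to create a file on one client and list it on another, assuming SSH access. The file name is just an example.

ssh s4 touch /var/data/test.txt
ssh s5 ls /var/data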

You can remove clients with the delpool command.

flockport delpool gvol client s5

This will remove server s5 from the gvol pool and stop sharing the /var/data folder.

You can also add allowed subnets to the pool for access control.

flockport addpool gvol allow 192.168.122.0/24

This sets the gvol pool access control IP range to the 192.168.122.0/24 subnet. Any clients added to the pool must be from this subnet range or the share will fail.
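
For reference, Gluster itself exposes this kind of access control through the auth.allow volume option, which accepts addresses and wildcards. The native equivalent would look roughly like this.

gluster volume set gvol auth.allow 192.168.122.*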

You can delete any previously allowed IP ranges with the delpool command.

flockport delpool gvol allow 192.168.122.0/24

You can remove the storage pool with the removepool command.

flockport removepool gvol

Use this option very carefully. This will delete the storage pool and all shared folders, and remove Gluster. So back up any data before deleting the pool.

You can also share the storage pool with the sharepool command.

flockport sharepool gvol s3 s4

This shares the gvol pool with servers s3 and s4. This is an alternative to the addpool command.

Similarly you can use the stoppool command to remove clients from the pool.

flockport stoppool gvol s3 s4

This will remove the s3 and s4 clients from the pool.

MFS

MFS is relatively less well known but in our experience a surprisingly robust distributed storage solution. It can saturate networks and the performance in our tests has been impressive.

We have gathered a lot of performance data across network and storage technologies that we will start sharing shortly.

MFS has decent management tools and even a GUI to manage the storage platform. Overall it's a well rounded and solid solution.

MFS uses the concept of a master, chunk servers, metaloggers and clients.

The master holds metadata, the chunk servers replicate the data across servers, metaloggers can act as passive backups of the master and clients consume the storage.

MFS has strong performance but requires RAM. You need a minimum of 2GB for the master and chunk servers. More is better. The master, chunk servers and metalogger must all be on different systems for resilience.

Do not take this lightly. You should not try to set up a storage pool with less than 2GB for the master and for the chunk servers, even for testing. You will run into all sorts of stability issues.

For this example we will use 4 servers. s1 will be the master server. s2 and s3 will be chunkservers. s4 will be the client.

You can create an MFS storage pool with the createpool command.

flockport createpool mvol mfs -a 192.168.51.0/24 -s 2

This creates the mvol storage pool with a default mfs configuration. You can use any name for the pool. In this case we use mvol.

The -a flag sets the allowed subnet for the storage pool. This determines which IP ranges can connect to the master as chunk servers and clients. The -s flag sets the size of the chunk servers. The default subnet is the subnet of the mfs master and the default size is 5GB.
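
For reference, MooseFS enforces this kind of access control on the master through its exports file, typically /etc/mfs/mfsexports.cfg. An entry allowing the subnet above would look roughly like this; the options shown are illustrative.

192.168.51.0/24  /  rw,alldirs,maproot=0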

Once the pool is created you can add a master server. You need to specify the master server's IP when adding a master.

You can add a master with the addpool command. The next steps will take a few minutes as Flockport installs MFS and initiates the storage pool.

flockport addpool mvol master s1 192.168.122.10

This adds s1 as mvol's master server with IP 192.168.122.10. This is the IP that will be used by chunk servers to communicate with the master.

You can now add chunk servers. A minimum of 2 chunk servers in replication is required to start a pool.

flockport addpool mvol chunk s2 s3

This adds servers s2 and s3 as the mvol pool's chunk servers.

Now that a basic pool is up you can share it with the addpool command.

flockport addpool mvol client s4

This adds s4 to the mvol storage pool as a client.

The shared folder is /mnt/mfsdata and is part of the default mfs settings when you create a storage pool.

Now any data added to s4's /mnt/mfsdata will be replicated by the chunk servers and shared with any other clients added to the pool.
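
For reference, MFS clients attach the shared folder with the mfsmount tool pointed at the master. Flockport handles this when you add a client; a manual mount would look roughly like this, using the master IP from the earlier step.

mfsmount /mnt/mfsdata -H 192.168.122.10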

You can add more clients to the mvol pool.

flockport addpool mvol client s5

Any data in /mnt/mfsdata will be replicated to the new client. And data added on the new client will be replicated to all other clients.

You can also add multiple clients in one go.

flockport addpool mvol client s4 s5

This will add servers s4 and s5 as mvol clients. You can also use the 'sharepool' command to add clients.

flockport sharepool mvol s4 s5

You can change defaults when creating the storage pool with the following flags.

flockport createpool mvol mfs -v volume -d shared -a allow -s size

The -v flag sets the volume folder. This is where chunk servers store their data and is /mnt/mfschunks by default. The -d flag sets the shared folder which will be shared with clients and is /mnt/mfsdata by default. The -a flag sets the allowed subnets which is not set by default. And the -s flag sets the pool size which is 2GB by default. A minimum of 5GB is recommended.

You can remove clients with the delpool command.

flockport delpool mvol client s5

This will remove the client s5 from the mvol pool.

MFS provides an admin GUI which is available on port 9425. You can forward this port to your host and access the GUI in your browser at hostip:9425.
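
If the master is on a remote server, an SSH tunnel is a simple way to reach the GUI, assuming you have SSH access to it. The user name here is illustrative.

ssh -L 9425:localhost:9425 user@s1

Then open http://localhost:9425 in your browser.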

You can also remove chunk servers. But at any given time there must be 2 chunk servers in an MFS storage pool.

A chunk server must be 'marked' before it can be removed.

You can mark a chunk server with the changepool command.

flockport changepool mvol mark s2

This marks the s2 chunk server for removal. You can also mark a chunk server for removal in the MFS admin GUI.

Once a chunk server is marked you can remove it with the delpool command.

flockport delpool mvol chunk s2

Remember there must be a minimum of 1 chunk server in the pool at any given time. You will not be able to delete a chunk server if there are fewer than 2 in a pool.

You can also add a metalogger server to shadow the master.

flockport addpool mvol metalogger s7

This adds server s7 as a metalogger to the mvol pool.

The metalogger can act as a shadow and take over as master in case there are issues with the current master.