Distributed storage pools storage from multiple servers and makes it available to clients. It depends fundamentally on the quality of your network.
It is also a tough technical problem, as writes across the network inevitably introduce latency, reliability and performance issues. Most serious use cases use dedicated storage networks.
Flockport currently lets you add NFS shares and supports building storage pools with Gluster and MFS, and these are the technologies we are going to focus on in this article. There are other well-regarded options such as Ceph and OrangeFS. There are also projects like DRBD, which we have covered before, that replicate block devices over the network.
NFS is by far the easiest to set up and use, as it's part of the Linux kernel. Often all you need to do is configure and export a share on the server and install NFS clients on the other hosts to consume it.
On the server you can export a share by adding the folder to be shared to /etc/exports and starting the NFS service. It is then a simple matter of configuring clients to consume the share and mount it locally.
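The export and mount steps described above can be sketched as follows. This is only an illustration and not Flockport's own tooling: the helper names, the /srv/share path, the 10.0.0.0/24 client network and the nfs-server hostname are all hypothetical, and the export options shown (rw, sync, no_subtree_check) are just one common choice.

```python
def exports_line(path, clients, options=("rw", "sync", "no_subtree_check")):
    """Build one /etc/exports entry: <path> <clients>(<opt>,<opt>,...)."""
    return f"{path} {clients}({','.join(options)})"


def mount_command(server, path, mountpoint):
    """Build the client-side command that mounts the exported share."""
    return f"mount -t nfs {server}:{path} {mountpoint}"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(exports_line("/srv/share", "10.0.0.0/24"))
    print(mount_command("nfs-server", "/srv/share", "/mnt/share"))
```

The first line printed is what you would add to /etc/exports on the server, and the second is the command a client would run once its NFS client package is installed.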
Gluster is better known for its performance with large files, for instance video files. In the more typical use case of a large number of small files, performance is not as good.
Gluster operates on the concepts of peers and volumes. You start by selecting the hosts for your distributed storage pool. These are your peers. You connect them to each other with the gluster peer probe command. Once they are connected, you can allocate storage folders on each host to make up your Gluster volumes. These volumes can then be mounted by clients.
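The peer and volume workflow above can be sketched as a small helper that lists the commands to run on the first peer. The function names, hostnames (node1, node2), brick paths and the volume name gv0 are hypothetical and chosen only for illustration. The sketch builds the command strings and does not execute them.

```python
def gluster_setup_commands(volume, bricks):
    """Return the commands to run on the first peer to build a volume.

    `bricks` maps each peer hostname to the storage folder it contributes.
    """
    hosts = list(bricks)
    commands = []
    # Probe every other peer from the first host to connect the pool.
    for host in hosts[1:]:
        commands.append(f"gluster peer probe {host}")
    brick_args = " ".join(f"{host}:{path}" for host, path in bricks.items())
    commands.append(f"gluster volume create {volume} {brick_args}")
    commands.append(f"gluster volume start {volume}")
    return commands


def gluster_mount_command(host, volume, mountpoint):
    """Build the client-side command that mounts a Gluster volume."""
    return f"mount -t glusterfs {host}:/{volume} {mountpoint}"


if __name__ == "__main__":
    # Hypothetical two-node pool.
    plan = {"node1": "/data/brick1", "node2": "/data/brick1"}
    for command in gluster_setup_commands("gv0", plan):
        print(command)
    print(gluster_mount_command("node1", "gv0", "/mnt/gv0"))
```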
Gluster offers a number of ways to replicate data. You can, for instance, have 2-way replicas, which means data is not only distributed across the volume but also replicated on at least 2 hosts. Gluster also supports striping, which breaks files up into smaller parts that are placed across hosts, making reads faster.
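To make replication and striping more concrete, here is a toy model of both ideas. It is not Gluster's implementation: the function names, brick names and chunk size are hypothetical. It assumes that bricks are grouped into replica sets in the order they are listed and that the brick count must be a multiple of the replica count. The striping function simply places fixed-size chunks round-robin across hosts.

```python
def replica_sets(bricks, replica=2):
    """Group bricks, in listed order, into sets that hold the same data."""
    if len(bricks) % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return [bricks[i:i + replica] for i in range(0, len(bricks), replica)]


def stripe(data, hosts, chunk_size):
    """Toy striping: cut data into chunks and place them round-robin."""
    placement = {host: [] for host in hosts}
    for n, start in enumerate(range(0, len(data), chunk_size)):
        host = hosts[n % len(hosts)]
        placement[host].append(data[start:start + chunk_size])
    return placement


if __name__ == "__main__":
    # Four hypothetical bricks with 2-way replication give two replica sets.
    print(replica_sets(["n1:/b", "n2:/b", "n3:/b", "n4:/b"], replica=2))
    # An 8-byte "file" striped in 2-byte chunks over two hosts.
    print(stripe(b"abcdefgh", ["h1", "h2"], 2))
```

With four bricks and 2-way replication, files are distributed over two replica sets and each file lives on both bricks of its set. With striping, a reader can fetch the parts of one file from several hosts at once.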
MFS is not as well known, but in our experience it is far more robust and easier to use than Gluster. Gluster releases are not always compatible with each other, and since different distributions are likely to ship different versions, you either need to be on an identical distribution release on every host or add the Gluster repos and install it manually to get consistent versions across distributions.
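A quick way to catch the version problem before building a pool is to compare the Gluster version reported by each host. The sketch below assumes you have already collected those version strings yourself (for example from the output of gluster --version on each host). The function name, hostnames and version numbers are hypothetical.

```python
from collections import defaultdict


def version_mismatches(versions):
    """Group hosts by reported version; return {} when all hosts agree."""
    by_version = defaultdict(list)
    for host, version in versions.items():
        by_version[version].append(host)
    return dict(by_version) if len(by_version) > 1 else {}


if __name__ == "__main__":
    # Hypothetical hosts and versions, for illustration only.
    reported = {"node1": "3.8.4", "node2": "3.8.4", "node3": "3.10.1"}
    for version, hosts in version_mismatches(reported).items():
        print(version, hosts)
```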
MFS operates on the concept of a master server, chunk servers, a metadata server and clients.
MFS has strict memory requirements: you must ensure the master and chunk servers have a minimum of 2GB RAM.
You start building an MFS cluster by selecting a master server, which is responsible for managing the cluster. Then select the chunk servers, where the actual data is stored. The metadata server is not essential but acts as a backup of the master server. Once the chunk servers are online you can mount the MFS volume on clients.
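Before provisioning anything, the cluster layout and memory requirement described above can be sanity checked with a small script. This is a sketch under stated assumptions: the Host tuple, role names and hostnames are hypothetical, the 2GB minimum is read as applying to each master and chunk server individually, and the check assumes a single master.

```python
from collections import namedtuple

MIN_RAM_GB = 2  # minimum stated above for the master and chunk servers

# role is one of "master", "chunk", "metadata" or "client" (hypothetical labels)
Host = namedtuple("Host", "name role ram_gb")


def plan_problems(hosts):
    """Return a list of human-readable problems with a planned MFS cluster."""
    problems = []
    masters = [h for h in hosts if h.role == "master"]
    chunks = [h for h in hosts if h.role == "chunk"]
    if len(masters) != 1:
        problems.append(f"expected exactly 1 master, found {len(masters)}")
    if not chunks:
        problems.append("at least one chunk server is required")
    # The metadata server and clients are not checked for memory here.
    for h in masters + chunks:
        if h.ram_gb < MIN_RAM_GB:
            problems.append(
                f"{h.name} ({h.role}) has {h.ram_gb}GB RAM, needs {MIN_RAM_GB}GB"
            )
    return problems


if __name__ == "__main__":
    plan = [
        Host("m1", "master", 4),
        Host("c1", "chunk", 2),
        Host("c2", "chunk", 1),
        Host("meta1", "metadata", 1),
    ]
    for problem in plan_problems(plan):
        print(problem)
```

An empty list means the plan satisfies these checks. It says nothing about disk capacity or network quality, which still need to be reviewed separately.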
MFS has a GUI and dashboard that provide in-depth information on the storage cluster.
Flockport supports adding shared folders with NFS and building distributed storage pools with Gluster and MFS.
There are a few other distributed storage solutions we have not covered. Ceph is well known and offers block, object and file storage, but we have found its setup to be complex. There is also OrangeFS, which we have not yet explored in detail.
Flockport is a new container orchestration platform focused on showcasing the ease and flexibility of containers. Flockport currently supports LXC containers.
Flockport's new platform provides an app store, orchestration across servers, advanced networking and distributed storage support, service discovery, load balancing, HA, container builds and deployment automation.