Galera as central building block for OpenStack high availability

By Erkan Yanar 02-11-2014

 

OpenStack is likely the most popular open-source cloud computing platform (for IaaS or PaaS). If you run OpenStack for yourself or for customers, you want to run it without losing data and to be able to scale as the business grows. In this post we give a short overview of the OpenStack architecture, focusing on the database (MySQL/MariaDB) part. Then we discuss why Galera is an essential building block for making the database highly available and for scaling even very large installations.

 

A basic setup consists of the following:

A dashboard (program codename Horizon) provides a web UI to manage OpenStack. Authentication is done via the identity service (Keystone). The image service (Glance) provides images, e.g. Ubuntu LTS, CentOS 7, etc., to start from. The Nova service starts them on a compute node. If you want persistent volumes, you can also attach block storage (Cinder). There is also the network service, Neutron.

This setup is sufficient to provide on-demand self-service. Running any installation, you have to consider what happens when one of the components (e.g. Keystone) fails. The answer is quite easy: as long as the database is available, no service loses its data. All the components store their information (such as configuration, user, and runtime data) in the database. Because all the important data is in the database, nothing is lost by reinstalling a failed node or service. You can even run multiple instances of a service connected to the same database, which provides easy failover for the service.

Keystone and the dashboard are easy examples of services that can be recovered using the database. A crashing compute node (running Nova) has a bigger impact, as you lose all VM instances on that node too. The volumes, however, are most likely in a storage system that provides redundancy itself (like Ceph or GlusterFS), so the *lost* VMs can be migrated using the database, which provides all the (meta)information for the VMs.

OpenStack ships a command for evacuating [all] instances from a dead node to another one (`nova [host-]evacuate` *fallen_host* *new_host*).

The most important thing the command does is to change, in the database, the host the instance runs on (uuid is the UUID of the instance).

 

   mysql> UPDATE instances SET host = 'newhost' WHERE uuid = '11d5995e-51e1-45be-b5ee-dd0e3a2319ad';

 

Then tell nova to boot that instance:

 

   # nova reboot --hard 11d5995e-51e1-45be-b5ee-dd0e3a2319ad

 

The nova service will read from the database and start the instance on the new host. For details see http://docs.openstack.org/admin-guide-cloud/content/nova-compute-node-down-manual-recovery.html
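The database part of this manual recovery can be sketched in a few lines of Python. This is only an illustration under loud assumptions: it uses an in-memory SQLite database as a stand-in for the Nova database, the `instances` table is reduced to the two columns used above, and the helper names `reassign_host` and `demo` are ours and not part of OpenStack.

```python
import sqlite3


def reassign_host(conn, instance_uuid, new_host):
    """Point one instance at a new compute host; return the number of rows changed."""
    # Bound parameters avoid the quoting problems of hand-written SQL.
    cur = conn.execute(
        "UPDATE instances SET host = ? WHERE uuid = ?",
        (new_host, instance_uuid),
    )
    conn.commit()
    return cur.rowcount


def demo():
    # Hypothetical, heavily reduced stand-in for the Nova "instances" table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE instances (uuid TEXT PRIMARY KEY, host TEXT)")
    uuid = "11d5995e-51e1-45be-b5ee-dd0e3a2319ad"
    conn.execute("INSERT INTO instances VALUES (?, ?)", (uuid, "fallen_host"))
    changed = reassign_host(conn, uuid, "newhost")
    host = conn.execute(
        "SELECT host FROM instances WHERE uuid = ?", (uuid,)
    ).fetchone()[0]
    return changed, host
```

Checking the returned row count is a cheap safeguard: a value of 0 means the UUID was not found and nothing was changed.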

 

Since every service stores its information in the database, we have to make sure the database does not fail or lose data. OpenStack supports different database backends; MySQL is the most widely used and most stable one.

 

Galera beats the competition:

 

Even though we recommend Galera Cluster, let’s have a look at the two other most popular ways to build a highly available MySQL setup for OpenStack:

* MySQL Replication

* Shared Storage (SAN,DRBD)

 

MySQL Replication is asynchronous. Even semi-synchronous replication does not make master and slaves truly synchronous: the master only waits until a slave has received the transaction, not until it has been applied. While master-slave setups are quite good for read scale-out, you can’t use that feature, as OpenStack does not provide any read/write splitting. So the slave can only be used for failover or for separate reporting. With Galera, all queries (reads and writes) can be distributed over the nodes, e.g. using a proxy.

 

With shared storage you have a single point of failure. In case of a failover you have additional recovery time, as the database is marked as crashed. There is also no scaling, as only one node is running at a time.

For replication and shared storage you need to persist every write:

    innodb_flush_log_at_trx_commit=1

    sync_binlog=1

    ..

This reduces the throughput of MySQL. With shared storage, extra latency is added on top: with a SAN, the latency to the storage; with DRBD, the latency between the nodes.

Losing data means, for example, that OpenStack doesn’t know where a volume was attached. A created instance will never show up in the billing or even in your dashboard.

So with replication it may be a better idea to fail over only to a read-only slave and to try to recover as much data as possible from the master.

With Galera Cluster you don’t have that hassle. There is no need to put pressure on the local disk, as all data is already synchronously replicated to the other nodes. With HAProxy or LVS in front of Galera, there is nearly no service impact if one of the Galera Cluster nodes fails: a node that goes down simply stops serving, while the other nodes keep on working like before and take over some additional queries.
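What such a proxy does can be sketched roughly as follows: it checks each node's health and spreads queries only over the healthy ones. The status variables `wsrep_ready`, `wsrep_cluster_status` and `wsrep_local_state_comment` are real Galera status variables, but everything else here is an assumption for illustration: the function names, the node names, the dictionary layout, and the simple round-robin policy. A real HAProxy or LVS setup uses its own health checks and balancing algorithms.

```python
from itertools import cycle


def is_healthy(status):
    """Return True if a node's wsrep status says it can serve queries."""
    return (
        status.get("wsrep_ready") == "ON"
        and status.get("wsrep_cluster_status") == "Primary"
        and status.get("wsrep_local_state_comment") == "Synced"
    )


def healthy_nodes(cluster):
    """List the names of all nodes that pass the health check."""
    return [name for name, status in cluster.items() if is_healthy(status)]


def route(cluster, n_queries):
    """Distribute n_queries round-robin over the healthy nodes."""
    nodes = healthy_nodes(cluster)
    if not nodes:
        raise RuntimeError("no healthy Galera node available")
    picker = cycle(nodes)
    return [next(picker) for _ in range(n_queries)]


# Hypothetical three-node cluster; an empty status stands for an unreachable node.
SYNCED = {
    "wsrep_ready": "ON",
    "wsrep_cluster_status": "Primary",
    "wsrep_local_state_comment": "Synced",
}
CLUSTER = {"galera1": dict(SYNCED), "galera2": {}, "galera3": dict(SYNCED)}
```

In this sketch, `route(CLUSTER, 4)` sends the queries alternately to `galera1` and `galera3`, and the failed `galera2` simply receives nothing.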

 

This is where you most likely want to use Galera Cluster

 

Because Galera provides synchronous replication, you don’t lose data when a node fails.

Because it provides multi-master replication, every node can be used for easy scale-out.

So when you run your Galera Cluster, e.g. behind a load balancer, nodes may fail and Galera Cluster keeps on serving without losing data. With the metadata secured in Galera Cluster, recovering failed OpenStack services is always possible.

There are also other benefits to using Galera Cluster. One is restarting the whole cluster node by node without downtime (rolling restart, rolling upgrade). With this feature, no downtime has to be planned for configuration changes, software upgrades or hardware upgrades.

Galera Cluster provides two methods to resync nodes. The Incremental State Transfer (IST) sends the missing transactions from a local cache (gcache) to the restarted node. If not all of the missing transactions are still in the cache, it falls back to the State Snapshot Transfer (SST), which does a full sync of the data. SST is also used to automatically integrate new nodes into the cluster.

So replacing and adding nodes is easy with Galera Cluster.
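The choice between IST and SST can be illustrated with a simplified model. This is a sketch under our own assumptions, not Galera's actual implementation: the function name and its parameters are hypothetical, a sequence number of -1 stands for "position unknown", and `donor_cached_from` stands for the lowest sequence number still held in the donor's gcache.

```python
def choose_state_transfer(joiner_uuid, joiner_seqno,
                          donor_uuid, donor_cached_from, donor_seqno):
    """Return 'NONE', 'IST' or 'SST' for a joining node (simplified model)."""
    # A new node, or one with a different cluster history, needs a full snapshot.
    if joiner_uuid != donor_uuid or joiner_seqno < 0:
        return "SST"
    # The joiner already has everything the donor has.
    if joiner_seqno >= donor_seqno:
        return "NONE"
    # The joiner needs every transaction after joiner_seqno. IST only works
    # if the first missing one is still in the donor's cache.
    if joiner_seqno + 1 >= donor_cached_from:
        return "IST"
    return "SST"
```

In this model a node that was down briefly (for example at position 100 while the donor still caches everything from 50 onwards) gets an IST, while a node that was down for too long, or a brand-new node, gets an SST. This is also why a larger gcache makes IST more likely after longer outages.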

 

Conclusions:

MySQL is a central part of OpenStack.

Using Galera Cluster for synchronous multi-master replication gives you:

* Data protection and guaranteed 24/7 service availability
* Simple high availability for the service (Galera is being adopted by the main OpenStack vendors to provide easy installation and to avoid the manual method)
* Scale-out of the service
* Easy maintenance
* A geographically distributed solution

So from this point of view it is quite clear why Galera Cluster is the “standard” for OpenStack deployments.