Synchronous replication systems use eager replication: nodes in the cluster synchronize with all other nodes by updating all replicas within a single transaction, so that when a transaction commits, every node holds the same values. This process takes place using write-set replication over group communication.
The internal architecture of Galera Cluster revolves around four components:
- DBMS: The database server that runs on each individual node.
- wsrep API: The interface between the database server and the replication provider. It consists of:
  - wsrep hooks: The integration with the database server engine for write-set replication.
  - dlopen(): The function that makes the wsrep provider available to the wsrep hooks.
- Galera Replication Plugin: The plugin that enables write-set replication service functionality.
- Group Communication plugins: The group communication systems available to Galera Cluster, such as gcomm.
The wsrep API is a generic replication plugin interface for databases. It defines a set of application callbacks and replication plugin calls.
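The split between application callbacks and replication plugin calls can be pictured with a small in-process sketch. All names here are invented for illustration; the real wsrep API is a C interface with different symbols.

```python
# Toy stand-in for a replication provider: each node's server registers an
# "apply" callback (an application callback), and the server calls
# replicate() at commit time (a replication plugin call).
# Names are illustrative, not the real wsrep API symbols.

class ReplicationProvider:
    def __init__(self):
        self.apply_callbacks = []   # one apply callback per connected node
        self.seqno = 0              # global sequence counter

    def connect(self, apply_callback):
        # The database server registers its callback with the provider.
        self.apply_callbacks.append(apply_callback)

    def replicate(self, write_set):
        # Called by the server when a transaction commits; the provider
        # delivers the write-set to every node's apply callback in order.
        self.seqno += 1
        for apply_cb in self.apply_callbacks:
            apply_cb(self.seqno, write_set)
        return self.seqno

node1_log, node2_log = [], []
provider = ReplicationProvider()
provider.connect(lambda seqno, ws: node1_log.append((seqno, ws)))
provider.connect(lambda seqno, ws: node2_log.append((seqno, ws)))
provider.replicate(b"write-set-1")
```

Both nodes receive the same write-set with the same sequence number, which is the property the plugin interface exists to provide.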
The wsrep API uses a replication model that considers the database server to have a state. The state refers to the contents of the database. When a database is in use, clients modify the database content, thus changing its state. The wsrep API represents the changes in the database state as a series of atomic changes, or transactions.
In a database cluster, all nodes always have the same state. They synchronize with each other by replicating and applying state changes in the same serial order.
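The guarantee above can be demonstrated with a minimal sketch: replicas that start from the same state and apply the same atomic changes in the same serial order end up identical. The dictionary "transactions" below are illustrative, not Galera's write-set format.

```python
def apply_in_order(initial_state, transactions):
    """Apply each atomic change in the given serial order."""
    state = dict(initial_state)
    for txn in transactions:
        state.update(txn)   # each transaction applies atomically
    return state

# The same ordered change log is replicated to every node.
txns = [{"a": 1}, {"b": 2}, {"a": 3}]
node1 = apply_in_order({}, txns)
node2 = apply_in_order({}, txns)
assert node1 == node2 == {"a": 3, "b": 2}
```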
From a more technical perspective, Galera Cluster handles state changes in the following way:
- On one node in the cluster, a state change occurs in the database.
- In the database, the wsrep hooks translate the changes into a write-set.
- dlopen() makes the wsrep provider functions available to the wsrep hooks.
- The Galera Replication Plugin handles write-set certification and replication to the cluster.
On each node in the cluster, the write-sets are then applied by high-priority transactions.
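The commit path described here can be sketched as a toy model. The class and method names are invented, and real Galera also certifies each write-set against concurrent transactions before applying it, which this sketch omits.

```python
# Toy model of the commit path: a committing transaction's changes are
# collected into a write-set, given a position in the total order, and
# applied on every node. Names are illustrative only.

class Node:
    def __init__(self, name):
        self.name = name
        self.state = {}

    def apply(self, seqno, write_set):
        # In Galera this runs as a high-priority transaction that local
        # conflicting transactions cannot abort.
        self.state.update(write_set)

class Cluster:
    def __init__(self, nodes):
        self.nodes = nodes
        self.seqno = 0   # total order comes from group communication

    def commit(self, changes):
        write_set = dict(changes)   # wsrep hooks collect changes into a write-set
        self.seqno += 1
        for node in self.nodes:     # replicate to every node, origin included
            node.apply(self.seqno, write_set)

cluster = Cluster([Node("n1"), Node("n2"), Node("n3")])
cluster.commit({"id": 1, "balance": 100})
```

After the commit, every node holds the same state, which is the invariant the process exists to maintain.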
In order to keep the state identical across the cluster, the wsrep API uses a Global Transaction ID, or GTID. This allows it to identify state changes and to identify the current state in relation to the last state change.
The Global Transaction ID consists of the following components:
- State UUID: A unique identifier for the state and for the sequence of changes it undergoes.
- Ordinal Sequence Number (seqno): A 64-bit signed integer used to denote the position of the change in the sequence.
The Global Transaction ID allows you to compare application states and to establish the order of state changes. You can use it to determine whether a change has already been applied and whether it is applicable at all to a given state.
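These comparisons can be sketched by modeling the GTID as a (state UUID, sequence number) pair. The field and function names below are invented for the example.

```python
# Sketch of GTID comparisons: same UUID means the same state history,
# and the seqno orders changes within that history.
from dataclasses import dataclass
import uuid

@dataclass(frozen=True)
class GTID:
    state_uuid: str   # identifies the state history being changed
    seqno: int        # position of the change within that history

def already_applied(gtid, current):
    # Same history, at or before the current position.
    return gtid.state_uuid == current.state_uuid and gtid.seqno <= current.seqno

def applicable(gtid, current):
    # A change from a different state history is not applicable at all;
    # within the same history, only changes past the current position apply.
    return gtid.state_uuid == current.state_uuid and gtid.seqno > current.seqno

history = str(uuid.uuid4())
current = GTID(history, 41)
assert applicable(GTID(history, 42), current)                    # not yet applied
assert already_applied(GTID(history, 40), current)               # already applied
assert not applicable(GTID(str(uuid.uuid4()), 42), current)      # different history
```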
From a more technical perspective, the Galera Replication Plugin consists of the following components:
- Certification Layer: Prepares write-sets and performs certification checks on them, ensuring that they can be applied.
- Replication Layer: Manages the replication protocol and provides total ordering capability.
- Group Communication Framework: The plugin architecture for group communication, described below.
The Group Communication Framework provides a plugin architecture for the various group communication systems that connect to Galera Cluster.
Galera Cluster is built on top of a proprietary group communication system layer, which implements a virtual synchrony QoS (Quality of Service). Virtual synchrony unifies the data delivery and cluster membership services, providing a clear formalism for message delivery semantics.
While virtual synchrony guarantees consistency, it does not guarantee temporal synchrony, which is necessary for smooth multi-master operation. To get around this, Galera Cluster implements its own runtime-configurable temporal flow control. Flow control keeps nodes synchronized to a fraction of a second.
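One way to picture flow control is as a receive-queue threshold: a node that falls behind asks the cluster to pause replication until it drains its backlog. This is a toy sketch; the constant name only loosely mirrors Galera parameters such as gcs.fc_limit, and the resume threshold here is arbitrary.

```python
FC_LIMIT = 16   # pause when the receive queue grows past this many write-sets

class Replica:
    def __init__(self):
        self.recv_queue = []
        self.paused = False   # True => this node asked the cluster to pause

    def enqueue(self, write_set):
        # A write-set arrives from the cluster faster than we can apply it.
        self.recv_queue.append(write_set)
        if len(self.recv_queue) > FC_LIMIT:
            self.paused = True          # emit a flow-control "pause" message

    def apply_one(self):
        # Apply one queued write-set; resume once the backlog has drained.
        if self.recv_queue:
            self.recv_queue.pop(0)
        if self.paused and len(self.recv_queue) <= FC_LIMIT // 2:
            self.paused = False         # emit a flow-control "resume" message

node = Replica()
for i in range(FC_LIMIT + 1):           # a burst the node cannot keep up with
    node.enqueue(f"ws-{i}")
assert node.paused                      # the node throttles the cluster

while node.recv_queue:                  # catching up...
    node.apply_one()
assert not node.paused                  # replication resumes
```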
In addition to this, the Group Communication Framework also provides a total ordering of messages from multiple sources. It uses this to generate Global Transaction IDs in a multi-master cluster.
At the transport level, Galera Cluster is a symmetric undirected graph: all database nodes connect to each other over TCP connections. By default, TCP is used for both message replication and the cluster membership services, but you can also use UDP multicast for replication in a LAN.
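The full-mesh topology can be stated as a small check: with n nodes, there are n*(n-1)/2 undirected links, and each link is shared symmetrically by exactly two nodes. The node names are made up for the example.

```python
# Model the cluster as a symmetric undirected graph: one link per node pair.
from itertools import combinations

nodes = ["node1", "node2", "node3", "node4"]

# frozenset makes each link undirected: node1<->node2 is the same
# link as node2<->node1.
links = {frozenset(pair) for pair in combinations(nodes, 2)}

assert len(links) == len(nodes) * (len(nodes) - 1) // 2   # 6 links for 4 nodes
assert frozenset({"node1", "node3"}) in links
```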