Support for Transaction Isolation Levels

Length: 1009 words; Writer: Seppo Jaakola; Published: September 21, 2015; Topic: General; Level: Intermediate

There appears to be a great deal of misunderstanding about which MySQL transaction isolation levels Galera Cluster actually supports, and how. This page tries to clear up those uncertainties.

In general, Galera Cluster can support transaction isolation levels up to REPEATABLE READ (see ./documentation/isolation-levels).

Transactions running on the same node are isolated according to the transaction isolation level configured for MySQL. So, if you have configured the default REPEATABLE READ isolation level, transactions issued on the same node will behave under REPEATABLE READ semantics. In addition, transactions issued on separate cluster nodes are subject to the ‘first committer wins’ rule. This avoids the lost update problem that generally hurts REPEATABLE READ isolation.
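
For example, the effective isolation level can be set and verified from the MySQL client. A minimal sketch (on older MySQL versions the session variable is named tx_isolation, as shown here; recent versions rename it to transaction_isolation, and the setting can be made permanent with transaction-isolation = REPEATABLE-READ in the [mysqld] section of my.cnf):

node1> set session transaction isolation level repeatable read;
Query OK, 0 rows affected

node1> select @@tx_isolation;
+-----------------+
| @@tx_isolation  |
+-----------------+
| REPEATABLE-READ |
+-----------------+
1 row in set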

Here is an example showing how the lost update anomaly hurts transactions using plain REPEATABLE READ isolation. Two transactions read a value from a row in table t, then update that value and commit. Both transactions are issued on the same node, so they behave under MySQL’s REPEATABLE READ semantics.

Transaction #1                  Transaction #2

node1> begin;
Query OK, 0 rows affected

                             node1> begin;
                             Query OK, 0 rows affected

node1> select * from t;
+---+------+
| i | j    |
+---+------+
| 1 |    0 |
+---+------+
1 row in set

                             node1> select * from t;
                             +---+------+
                             | i | j    |
                             +---+------+
                             | 1 |    0 |
                             +---+------+
                             1 row in set

node1> update t set j=1 where i=1;
Query OK, 1 row affected

                             node1> update t set j=2 where i=1;
                             # Waits for InnoDB lock ...

node1> commit;
Query OK, 0 rows affected

                             Query OK, 1 row affected
                             # Lock waiting ends

node1> select * from t;
+---+------+
| i | j    |
+---+------+
| 1 |    1 |
+---+------+
1 row in set
# Transaction #1's update is now visible

                             node1> commit;

node1> select * from t;
+---+------+
| i | j    |
+---+------+
| 1 |    2 |
+---+------+
1 row in set

                             node1> select * from t;
                             +---+------+
                             | i | j    |
                             +---+------+
                             | 1 |    2 |
                             +---+------+
                             1 row in set

# Transaction #2 has written over transaction #1's result

Both transactions read the same value (0) from the row, and decided how to change it based on the application logic. But as the outcome of running these transactions in parallel, transaction #1’s write was visible in the database only for a short while: it went unnoticed by transaction #2, which blindly overwrote it with its own value (2). Had transaction #2 noticed that the value had changed to 1, the application logic might have dictated writing a different value to the table.
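
Note that this particular anomaly can also be side-stepped without explicit locking, if the intended change can be expressed as a relative one: the read and the write then happen atomically inside a single UPDATE statement, under the same row lock. A sketch, assuming the intent was simply to increment the value:

node1> update t set j = j + 1 where i = 1;
Query OK, 1 row affected

Two such statements issued in parallel on the same node serialize on the row lock, so neither update is lost.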

Here is the same exercise, but now transactions #1 and #2 connect to separate cluster nodes (node1 and node2).

Transaction #1                  Transaction #2

node1> begin;
Query OK, 0 rows affected

                             node2> begin;
                             Query OK, 0 rows affected

node1> select * from t;
+---+------+
| i | j    |
+---+------+
| 1 |    0 |
+---+------+
1 row in set
                             node2> select * from t;
                             +---+------+
                             | i | j    |
                             +---+------+
                             | 1 |    0 |
                             +---+------+
                             1 row in set

node1> update t set j=1 where i=1;
Query OK, 1 row affected

                             node2> update t set j=2 where i=1;
                             Query OK, 1 row affected
                             # There is no lock wait here,
                             # as we are operating
                             # on another node

node1> commit;
Query OK, 0 rows affected
                             node2> commit;
                             ERROR 1213 (40001): Deadlock
                             found when trying to get lock;
                             try restarting transaction

node1> select * from t;
+---+------+
| i | j    |
+---+------+
| 1 |    1 |
+---+------+
1 row in set
                             node2> select * from t;
                             +---+------+
                             | i | j    |
                             +---+------+
                             | 1 |    1 |
                             +---+------+
                             1 row in set

# Transaction #1, the first committer, has won

Here the ‘first committer wins’ rule is applied, and only transaction #1 is allowed to commit. Transaction #2 notices that it has been working on the same database snapshot as transaction #1 and has a conflicting write; it therefore aborts and returns a deadlock error to the client.
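
From the application’s point of view, this is handled like any other InnoDB deadlock: retry the whole transaction, just as the error message suggests. A minimal sketch of such a retry on node2; on the second attempt the read sees transaction #1’s committed value, so the application logic can take it into account before writing:

node2> begin;
Query OK, 0 rows affected

node2> select * from t;
+---+------+
| i | j    |
+---+------+
| 1 |    1 |
+---+------+
1 row in set

node2> update t set j=2 where i=1;
Query OK, 1 row affected
# or whatever value the
# application logic now dictates

node2> commit;
Query OK, 0 rows affected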

Therefore, transactions running on separate nodes are protected from the ‘lost update’ problem. The lost update phenomenon can hurt only transactions that use REPEATABLE READ isolation and run on the same node. However, MySQL/InnoDB has a workaround even for this situation: SELECT ... FOR UPDATE, which locks the rows read and prevents other transactions from operating on this ‘read view’ until the transaction finally commits. Here is an example showing how transactions #1 and #2 can tackle the lost update problem by using the row locks granted by SELECT ... FOR UPDATE:

Transaction #1                  Transaction #2

node1> begin;
Query OK, 0 rows affected
                             node1> begin;
                             Query OK, 0 rows affected

node1> select * from t for update;
+---+------+
| i | j    |
+---+------+
| 1 |    0 |
+---+------+
1 row in set
                             node1> select * from t for update;
                             # Blocks waiting
                             # for InnoDB lock

node1> update t set j=1 where i=1;
Query OK, 1 row affected

node1> commit;
Query OK, 0 rows affected

                             +---+------+
                             | i | j    |
                             +---+------+
                             | 1 |    1 |
                             +---+------+
                             1 row in set
                             # Lock wait ends
                             # We see transaction #1's
                             # result and work on
                             # a different snapshot now

                             node1> update t set j=3 where i=1;
                             Query OK, 1 row affected

                             node1> commit;
                             Query OK, 0 rows affected
node1> select * from t;
+---+------+
| i | j    |
+---+------+
| 1 |    3 |
+---+------+
1 row in set
                             node1> select * from t;
                             +---+------+
                             | i | j    |
                             +---+------+
                             | 1 |    3 |
                             +---+------+
                             1 row in set

Galera Cluster supports transaction isolation levels up to REPEATABLE READ and also protects against the lost update problem, provided that the application uses a proper locking strategy like the one shown above.

Galera does not support SERIALIZABLE isolation in a multi-primary topology, because there is currently no means to protect read locks from being overwritten by replication. SERIALIZABLE isolation should work in a controlled single-primary topology, but in practice its use is not recommended. Moreover, SERIALIZABLE isolation may be disabled in future releases, unless we can find a way to support it safely.