Support for large transactions in Galera: Taking Streaming Replication for a spin

We have documented streaming replication, and we have also mentioned how you can use streaming replication, and we also once did a webinar with recording on it, and we wrote a very in-depth blog post on how to use streaming replication to manage large transactions with Galera Cluster.

However, we had a question recently about streaming replication around the usage of wsrep_trx_fragment_unit and wsrep_trx_fragment_size (also, well documented).

So we have a usual 3-node Galera Cluster, and here is a base test:

CREATE DATABASE streaming_test;
USE streaming_test;

CREATE TABLE large_transactions (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    data_chunk LONGTEXT,
    insertion_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

DELIMITER //
CREATE PROCEDURE generate_test_data(IN chunk_size INT)
BEGIN
    DECLARE i INT DEFAULT 1;
    DECLARE large_data LONGTEXT;
    SET large_data = REPEAT('A', 1024); -- 1KB of data
    
    START TRANSACTION;
    WHILE i <= chunk_size DO
        INSERT INTO large_transactions (data_chunk) VALUES (large_data);
        SET i = i + 1;
    END WHILE;
    COMMIT;
END //
DELIMITER ;

Then we will test the options of wsrep_trx_fragment_unit which is: bytes, rows, statements. We also use Galera Manager to monitor streaming replication. Let’s start with bytes:

mysql> SET wsrep_trx_fragment_unit = 'bytes';
Query OK, 0 rows affected (0.00 sec)

mysql> SET wsrep_trx_fragment_size = 1048576;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW VARIABLES LIKE 'wsrep_trx_fragment%';
+-------------------------+---------+
| Variable_name           | Value   |
+-------------------------+---------+
| wsrep_trx_fragment_size | 1048576 |
| wsrep_trx_fragment_unit | bytes   |
+-------------------------+---------+
2 rows in set (0.00 sec)

mysql> call generate_test_data(1000000);
Query OK, 0 rows affected (1 min 58.67 sec) 

This completed in 1 minute and 58.67 seconds. Reasonable. Let’s try this with rows:

mysql> SET wsrep_trx_fragment_unit = 'rows';
Query OK, 0 rows affected (0.00 sec)

mysql> SET wsrep_trx_fragment_size = 1000; 
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW VARIABLES LIKE 'wsrep_trx_fragment%';
+-------------------------+-------+
| Variable_name           | Value |
+-------------------------+-------+
| wsrep_trx_fragment_size | 1000  |
| wsrep_trx_fragment_unit | rows  |
+-------------------------+-------+
2 rows in set (0.00 sec)

mysql> call generate_test_data(1000000);
Query OK, 0 rows affected (1 min 57.81 sec)

Almost the same execution time! You can also keep track of the status via the wsrep.streaming_log system table. Here’s a quick example:

mysql> select count(*) from mysql.wsrep_streaming_log;
+----------+
| count(*) |
+----------+
|     1000 |
+----------+
1 row in set (0.01 sec)

Let’s try with statements:

mysql> SET wsrep_trx_fragment_unit = 'statements';
Query OK, 0 rows affected (0.00 sec)

mysql> SET wsrep_trx_fragment_size = 3; 
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW VARIABLES LIKE 'wsrep_trx_fragment%';
+-------------------------+------------+
| Variable_name           | Value      |
+-------------------------+------------+
| wsrep_trx_fragment_size | 3          |
| wsrep_trx_fragment_unit | statements |
+-------------------------+------------+
2 rows in set (0.00 sec)

mysql> select count(*) from mysql.wsrep_streaming_log;
+----------+
| count(*) |
+----------+
|   926770 |
+----------+
1 row in set (1.68 sec)

mysql> call generate_test_data(1000000);

Query OK, 0 rows affected (1 hour 20 min 40.45 sec) 

This took far too long!

 

You might also note that you should specify bytes, not 1M:

mysql> SET wsrep_trx_fragment_size = 1M;
ERROR 1232 (42000): Incorrect argument type to variable 'wsrep_trx_fragment_size'


mysql> SET wsrep_trx_fragment_size = 1048576;
Query OK, 0 rows affected (0.00 sec)

Note that you are setting all this on a session basis, and it is well worth using streaming replication if you have to deal with large transactions.