We have documented streaming replication, and we have also mentioned how you can use streaming replication, and we also once did a webinar with recording on it, and we wrote a very in-depth blog post on how to use streaming replication to manage large transactions with Galera Cluster.
However, we had a question recently about streaming replication around the usage of wsrep_trx_fragment_unit and wsrep_trx_fragment_size (also, well documented).
So we have a usual 3-node Galera Cluster, and here is a base test:
CREATE DATABASE streaming_test;
USE streaming_test;
CREATE TABLE large_transactions (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
data_chunk LONGTEXT,
insertion_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
DELIMITER //
CREATE PROCEDURE generate_test_data(IN chunk_size INT)
BEGIN
DECLARE i INT DEFAULT 1;
DECLARE large_data LONGTEXT;
SET large_data = REPEAT('A', 1024); -- 1KB of data
START TRANSACTION;
WHILE i <= chunk_size DO
INSERT INTO large_transactions (data_chunk) VALUES (large_data);
SET i = i + 1;
END WHILE;
COMMIT;
END //
DELIMITER ;
Then we will test the options of wsrep_trx_fragment_unit which is: bytes, rows, statements. We also use Galera Manager to monitor streaming replication. Let’s start with bytes:
mysql> SET wsrep_trx_fragment_unit = 'bytes';
Query OK, 0 rows affected (0.00 sec)
mysql> SET wsrep_trx_fragment_size = 1048576;
Query OK, 0 rows affected (0.00 sec)
mysql> SHOW VARIABLES LIKE 'wsrep_trx_fragment%';
+-------------------------+---------+
| Variable_name | Value |
+-------------------------+---------+
| wsrep_trx_fragment_size | 1048576 |
| wsrep_trx_fragment_unit | bytes |
+-------------------------+---------+
2 rows in set (0.00 sec)
mysql> call generate_test_data(1000000);
Query OK, 0 rows affected (1 min 58.67 sec)
This completed in 1 minute and 58.67 seconds. Reasonable. Let’s try this with rows:
mysql> SET wsrep_trx_fragment_unit = 'rows';
Query OK, 0 rows affected (0.00 sec)
mysql> SET wsrep_trx_fragment_size = 1000;
Query OK, 0 rows affected (0.00 sec)
mysql> SHOW VARIABLES LIKE 'wsrep_trx_fragment%';
+-------------------------+-------+
| Variable_name | Value |
+-------------------------+-------+
| wsrep_trx_fragment_size | 1000 |
| wsrep_trx_fragment_unit | rows |
+-------------------------+-------+
2 rows in set (0.00 sec)
mysql> call generate_test_data(1000000);
Query OK, 0 rows affected (1 min 57.81 sec)
Almost the same execution time! You can also keep track of the status via the wsrep.streaming_log system table. Here’s a quick example:
mysql> select count(*) from mysql.wsrep_streaming_log;
+----------+
| count(*) |
+----------+
| 1000 |
+----------+
1 row in set (0.01 sec)
Let’s try with statements:
mysql> SET wsrep_trx_fragment_unit = 'statements';
Query OK, 0 rows affected (0.00 sec)
mysql> SET wsrep_trx_fragment_size = 3;
Query OK, 0 rows affected (0.00 sec)
mysql> SHOW VARIABLES LIKE 'wsrep_trx_fragment%';
+-------------------------+------------+
| Variable_name | Value |
+-------------------------+------------+
| wsrep_trx_fragment_size | 3 |
| wsrep_trx_fragment_unit | statements |
+-------------------------+------------+
2 rows in set (0.00 sec)
mysql> select count(*) from mysql.wsrep_streaming_log;
+----------+
| count(*) |
+----------+
| 926770 |
+----------+
1 row in set (1.68 sec)
mysql> call generate_test_data(1000000);
Query OK, 0 rows affected (1 hour 20 min 40.45 sec)
This took far too long!
You might also note that you should specify bytes, not 1M:
mysql> SET wsrep_trx_fragment_size = 1M;
ERROR 1232 (42000): Incorrect argument type to variable 'wsrep_trx_fragment_size'
mysql> SET wsrep_trx_fragment_size = 1048576;
Query OK, 0 rows affected (0.00 sec)
Note that you are setting all this on a session basis, and it is well worth using streaming replication if you have to deal with large transactions.