Commit Failed for Reason 3

Length: 326 words; Published: April 1, 2014; Updated: November 12, 2019; Category: Schema & SQL; Type: Troubleshooting

With wsrep_debug set to ON, you may occasionally see a message in the error log noting that a commit has failed for reason 3. This problem can be resolved with a change in topology or with an upgrade.

Scenario

Suppose you enable wsrep_debug on the nodes in your cluster. You then attempt to change data locally on one node, but you encounter problems. When you check the database error log, you see a message saying that a commit has failed for reason 3. Below is an excerpt from a database server’s error log showing this:

110906 17:45:01 [Note] WSREP:
   BF kill (1, seqno: 16962377), victim:  (140588996478720 4) trx: 35525064
110906 17:45:01 [Note] WSREP:
   Aborting query: commit
110906 17:45:01 [Note] WSREP:
   kill trx QUERY_COMMITTING for 35525064
110906 17:45:01 [Note] WSREP:
   commit failed for reason: 3, seqno: -1
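If you need to check how often this message appears, a short script can scan the error log for it. The following is a minimal sketch, not an official tool: the function name find_commit_failures and the sample text are illustrative, and the pattern assumes the message wording shown in the excerpt above.

```python
import re

# Matches the message body as it appears in the excerpt,
# e.g. "commit failed for reason: 3, seqno: -1".
COMMIT_FAILED = re.compile(r"commit failed for reason: (\d+), seqno: (-?\d+)")


def find_commit_failures(log_text):
    """Return a list of (reason, seqno) tuples found in the log text."""
    return [(int(reason), int(seqno))
            for reason, seqno in COMMIT_FAILED.findall(log_text)]


# Sample text copied from the excerpt; in practice you would read the
# server's error log file instead.
SAMPLE = """\
110906 17:45:01 [Note] WSREP:
   BF kill (1, seqno: 16962377), victim:  (140588996478720 4) trx: 35525064
110906 17:45:01 [Note] WSREP:
   kill trx QUERY_COMMITTING for 35525064
110906 17:45:01 [Note] WSREP:
   commit failed for reason: 3, seqno: -1
"""

print(find_commit_failures(SAMPLE))  # prints [(3, -1)]
```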

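When attempting to apply a replicated write-set, replica threads occasionally encounter lock conflicts with local transactions, which may already be in the commit phase. In such cases, the node aborts the local transaction, allowing the replica thread to proceed. The excerpt shows this sequence: a brute-force (BF) kill of the victim transaction, the abort of its commit query, and then the failed commit.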

This is a consequence of optimistic transaction execution. The database server executes transactions with the expectation that there won’t be any row conflicts. It’s an expected issue in a multi-primary configuration.
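From the application's side, optimistic execution means a transaction can be aborted at commit time and may need to be run again. The sketch below illustrates that pattern under stated assumptions: TransactionAborted is a hypothetical stand-in for whatever error your database driver raises when the node aborts a local transaction, and run_with_retry and flaky_update are illustrative names, not part of any Galera or driver API.

```python
import time


class TransactionAborted(Exception):
    """Hypothetical stand-in for the driver error raised when the node
    aborts a local transaction in favor of a replicated write-set."""


def run_with_retry(txn, attempts=3, delay=0.0):
    """Call txn() until it succeeds or the attempts run out.

    The last TransactionAborted is re-raised if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return txn()
        except TransactionAborted:
            if attempt == attempts:
                raise
            # Simple linear backoff before the next attempt.
            time.sleep(delay * attempt)


# Demonstration: a transaction that is aborted twice, then commits.
calls = {"n": 0}


def flaky_update():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransactionAborted("commit aborted by replicated write-set")
    return "committed"


print(run_with_retry(flaky_update))  # prints committed
```

Retrying only makes sense when the whole transaction can be re-executed safely from the start. It reduces the impact of the conflicts but does not remove their cause.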

Work-Arounds & Solution

To mitigate such conflicts, there are a couple of things you can do. One is to use the cluster in a primary-replica configuration, directing all writes to a single node. The other is to apply the same read/write splitting you would use with primary-replica replication: send writes to one node and distribute reads across the others.
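The single-writer idea can be sketched as a small routing helper in the application layer. This is only an illustration under stated assumptions: the NodeRouter class, the node names, and the keyword check are hypothetical, and in practice this job is often done by a load balancer or proxy rather than by application code.

```python
import itertools

# Statements treated as reads in this simplified sketch. Locking reads
# such as SELECT ... FOR UPDATE are not handled specially here.
READ_ONLY = {"SELECT", "SHOW"}


class NodeRouter:
    """Send every write to one node and rotate reads across the rest."""

    def __init__(self, nodes):
        if not nodes:
            raise ValueError("at least one node is required")
        self.writer = nodes[0]
        # With a single node, reads go to the writer as well.
        self._readers = itertools.cycle(nodes[1:] or nodes)

    def node_for(self, statement):
        words = statement.split()
        first = words[0].upper() if words else ""
        if first in READ_ONLY:
            return next(self._readers)
        return self.writer


# Hypothetical node names, for illustration only.
router = NodeRouter(["node1", "node2", "node3"])
print(router.node_for("UPDATE t SET a = 1"))  # prints node1
print(router.node_for("SELECT a FROM t"))     # prints node2
print(router.node_for("SELECT a FROM t"))     # prints node3
```

Because all writes land on one node, local transactions no longer compete with write-sets replicated from other nodes for the same rows.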

The solution, though, may be to upgrade to the latest versions of MySQL or MariaDB and of Galera Cluster. This problem seems to have occurred only in older versions of the database and cluster software.