Server Error Log

Node 0 (XXX) requested state transfer from '*any*'. Selected 1 (XXX) as donor.

The node is attempting to initiate a State Snapshot Transfer.

In the event that you do not explicitly set the donor node through wsrep_sst_donor, the Group Communication module selects a donor based on the information available about the node states.

Group Communication monitors node states for the purposes of flow control, state transfers and quorum calculations. That is, to ensure that a node that shows as JOINING does not count towards flow control and quorum.

The node can serve as a donor when it is in the SYNCED state. The joiner node selects a donor from the available synced nodes. It shows preference to synced nodes that have the same gmcast.segment wsrep Provider option or it selects the first in the index. When the donor node is chosen its state changes immediately to DONOR, meaning that it is no longer available for requests.

If the node can find no free nodes that show as SYNCED, the joining node reports:

Requesting state transfer failed: -11(Resource temporarily
unavailable).  Will keep retrying every 1 second(s).

The joining node continues to retry the state transfer request.

SQL SYNTAX Errors

When a State Snapshot Transfer fails using mysqldump for any reason, the node writes a SQL SYNTAX message into the server error logs.

This is a pseudo-statement. You can find the actual error message the state transfer returned within the SQL SYNTAX entry. It provides the information you need to correct the problem.

Commit failed for reason: 3

When you have wsrep_debug turned ON, you may occasionally see a message noting that a commit has failed due to reason 3. For example:

110906 17:45:01 [Note] WSREP: BF kill (1, seqno: 16962377), victim:  (140588996478720 4) trx: 35525064
110906 17:45:01 [Note] WSREP: Aborting query: commit
110906 17:45:01 [Note] WSREP: kill trx QUERY_COMMITTING for 35525064
110906 17:45:01 [Note] WSREP: commit failed for reason: 3, seqno: -1

When attempting to apply a replicated write-set, slave threads occasionally encounter lock conflicts with local transactions, which may already be in the commit phase. In such cases, the node aborts the local transaction, allowing the slave thread to proceed.

This is a consequence of optimistic transaction execution. The database server executes transaction under the expectation that there will be no row conflicts. It is an expected issue in a multi-master configuration.

To mitigate such conflicts:

  • Use the cluster in a master-slave configuration. Direct all writes to a single node.
  • Use the same approaches as for master-slave read/write splitting.