Galera: runtime adjustment of applier threads

In a MariaDB Galera node, writesets received from other nodes can be applied in parallel by multiple applier threads. The number of slave applier threads is controlled by the server’s wsrep_slave_threads system variable. It is a dynamic variable, so the number of slave applier threads can be adjusted at runtime. The current number can be checked through “SHOW PROCESSLIST” or the wsrep_thread_count status variable (MDEV-6206).

One interesting point to note is that when @@global.wsrep_slave_threads is increased at runtime, the additional applier threads are spawned immediately. However, when the value is decreased, the effect is not visible immediately. Internally, the extra applier threads are not killed right away. Their exit is deferred: each extra thread exits gracefully only after it has received and applied one last writeset (transaction). So the number of applier threads will not decrease on an idle node; it drops only once the node starts receiving writesets to apply.
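To make the asymmetry concrete, here is a small stand-alone C++ model of the behaviour described above. It is only a sketch and not MariaDB code: ApplierPool, set_threads, apply_writeset and replay_scenario are hypothetical names, and the pool is a set of counters rather than real threads.

```cpp
#include <cassert>
#include <vector>

// Toy model of the applier pool. All names are hypothetical; none of this
// is MariaDB code.
struct ApplierPool {
  int configured;      // stands in for wsrep_slave_threads
  int running;         // applier threads currently alive
  int pending_change;  // stands in for wsrep_slave_count_change

  explicit ApplierPool(int n)
      : configured(n), running(n), pending_change(0) {}

  // Models SET GLOBAL wsrep_slave_threads = n.
  void set_threads(int n) {
    assert(n >= 1);
    pending_change += n - configured;
    configured = n;
    if (pending_change > 0) {   // growth: threads are spawned at once
      running += pending_change;
      pending_change = 0;
    }                           // shrink: only recorded, nothing exits yet
  }

  // Models one applier successfully committing a writeset.
  void apply_writeset() {
    if (pending_change < 0) {   // this applier takes one pending exit
      ++pending_change;
      --running;
    }
  }
};

// Replays: start with 4, grow to 8, shrink to 5 while idle, then apply
// four writesets. Returns the running count observed after each step.
std::vector<int> replay_scenario() {
  ApplierPool pool(4);
  std::vector<int> seen;
  pool.set_threads(8);
  seen.push_back(pool.running);
  pool.set_threads(5);
  seen.push_back(pool.running);
  for (int i = 0; i < 4; ++i) {
    pool.apply_writeset();
    seen.push_back(pool.running);
  }
  return seen;
}
```

In this model, replay_scenario() yields 8, 8, 7, 6, 5, 5: the increase is visible at once, the decrease shows nothing while the node is idle, and the count then falls by one per applied writeset until the new target is reached.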

Here are some snippets to help you understand this further:

1. Calculate the change.

  wsrep_slave_count_change += (var->value->val_int() - wsrep_slave_threads);

2a: If it is positive, spawn new applier threads.

  if (wsrep_slave_count_change > 0) {
    wsrep_create_appliers(wsrep_slave_count_change);
    wsrep_slave_count_change = 0;
  }

2b: Otherwise, mark the thread as “done” after it has applied (committed or rolled back) its last writeset. Note that the exit flag is set only after a successful commit.

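wsrep_cb_status_t wsrep_commit_cb(void*         const     ctx,
                                  uint32_t      const     flags,
                                  const wsrep_trx_meta_t* meta,
                                  wsrep_bool_t* const     exit,
                                  bool          const     commit)
{
  /* ... */
  if (commit)
    rcode = wsrep_commit(thd, meta->gtid.seqno);
  else
    rcode = wsrep_rollback(thd, meta->gtid.seqno);
  /* ... */
  if (wsrep_slave_count_change < 0 && commit && WSREP_CB_SUCCESS == rcode) {
    /* Re-check under the lock: another applier may have taken the exit. */
    mysql_mutex_lock(&LOCK_wsrep_slave_threads);
    if (wsrep_slave_count_change < 0) {
      wsrep_slave_count_change++;
      *exit = true;
    }
    mysql_mutex_unlock(&LOCK_wsrep_slave_threads);
  }
  /* ... */
}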

And the thread exits:

static void wsrep_replication_process(THD *thd)
{
  /* ... */
  rcode = wsrep->recv(wsrep, (void *)thd);
  DBUG_PRINT("wsrep",("wsrep_repl returned: %d", rcode));

  WSREP_INFO("applier thread exiting (code:%d)", rcode);
  /* ... */
}
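
The control flow across these snippets (a receive loop that hands each writeset to a commit callback and returns once the callback raises the exit flag) can be sketched in a few lines of stand-alone C++. This is a simplified illustration under stated assumptions, not the provider API: recv_loop, applier_lifetime and CommitCallback are hypothetical names, a std::deque stands in for the replication stream, and an empty queue stands in for an applier blocked on an idle node.

```cpp
#include <cassert>
#include <deque>
#include <functional>

// Hypothetical stand-in for the commit callback; the real one takes more
// arguments and returns a status code.
using CommitCallback = std::function<void(long seqno, bool* exit_flag)>;

// Simplified receive loop: passes each queued writeset to the callback and
// returns as soon as the callback asks the caller to exit. Returns the
// number of writesets consumed. An empty queue means nothing is delivered,
// which stands in for an applier that is parked waiting for work.
int recv_loop(std::deque<long>& incoming, const CommitCallback& on_commit) {
  int applied = 0;
  while (!incoming.empty()) {
    long seqno = incoming.front();
    incoming.pop_front();
    bool exit_flag = false;
    on_commit(seqno, &exit_flag);   // apply + commit, may request exit
    ++applied;
    if (exit_flag) break;           // the applier leaves the loop here
  }
  return applied;
}

// One applier's lifetime: it can only notice a pending shrink request
// (count_change < 0) from inside the callback, i.e. after a writeset.
int applier_lifetime(std::deque<long>& incoming, int& count_change) {
  return recv_loop(incoming, [&count_change](long, bool* exit_flag) {
    if (count_change < 0) {
      ++count_change;               // this applier takes one pending exit
      *exit_flag = true;
    }
  });
}
```

With a pending change of -1 and three queued writesets, applier_lifetime consumes exactly one writeset and returns, leaving two in the queue. With an empty queue the pending change stays at -1, which mirrors why the thread count does not drop on an idle node.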