couchdb-dev mailing list archives

From Phil May <phil....@motorolasolutions.com>
Subject Cluster Replication batch_size and batch_count Modification
Date Mon, 05 Jun 2017 03:06:59 GMT
I'm writing to check whether modifying the replication batch_count and
batch_size parameters for cluster replication is a good idea.

Some background – our data platform dev team noticed that under heavy write
load, cluster replication was falling behind. The following warning
messages started appearing in the logs, and the pending_changes value
consistently increased while under load.

[warning] 2017-05-18T20:15:22.320498Z couch-1@couch-1.couchdb <0.316.0>
-------- mem3_sync shards/a0000000-bfffffff/test.1495137986
couch-3@couch-3.couchdb
{pending_changes,474}
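
One way to track whether pending_changes keeps climbing is to pull the
value out of these warnings. The following is a minimal Python sketch, not
part of CouchDB: the regular expression is inferred only from the sample
warning above, and the helper names pending_changes and is_falling_behind
are hypothetical.

```python
import re

# Pattern inferred from the single sample warning above (an assumption);
# \s+ also matches newlines, so wrapped log text is handled too.
PENDING_RE = re.compile(
    r"\[warning\]\s+(?P<ts>\S+)\s+(?P<node>\S+)\s+<[\d.]+>\s+-+\s+"
    r"mem3_sync\s+(?P<shard>\S+)\s+(?P<target>\S+)\s+"
    r"\{pending_changes,(?P<pending>\d+)\}"
)


def pending_changes(log_text):
    """Return (timestamp, shard, target_node, pending) for each warning found."""
    return [
        (m["ts"], m["shard"], m["target"], int(m["pending"]))
        for m in PENDING_RE.finditer(log_text)
    ]


def is_falling_behind(counts):
    """True if the counts never decrease and the last exceeds the first."""
    if len(counts) < 2:
        return False
    never_decreases = all(b >= a for a, b in zip(counts, counts[1:]))
    return never_decreases and counts[-1] > counts[0]
```

Feeding the pending values for one shard and target node, in time order,
to is_falling_behind gives a rough signal of the backlog growth described
above.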

What we saw is described in COUCHDB-3421
<https://issues.apache.org/jira/browse/COUCHDB-3421>. In addition, CouchDB
appears to be CPU bound while this is occurring, not I/O bound as would
seem reasonable to expect for replication.

When we looked into this, we discovered in the source two values affecting
replication, batch_size and batch_count. For cluster replication, these
values are fixed at 100 and 1 respectively, so we made them configurable.
We tried various values, and increasing batch_size (and, to a lesser
extent, batch_count) appears to improve our write performance. As a point of
reference, with batch_count=50 and batch_size=5000 we can handle about
double the write throughput with no warnings. We are experimenting with
other values.
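
To put the two settings side by side, here is a small Python sketch of the
arithmetic. It rests on an assumption not confirmed in this message: that a
single internal replication pass handles at most batch_count batches of at
most batch_size changes each. The function name changes_per_pass is
hypothetical.

```python
def changes_per_pass(batch_size, batch_count):
    """Upper bound on changes handled in one pass.

    Assumes (unconfirmed) that a pass runs up to batch_count batches
    of up to batch_size changes each.
    """
    return batch_size * batch_count


# Fixed values reported above for cluster replication.
default_bound = changes_per_pass(batch_size=100, batch_count=1)

# Values from the reference experiment above.
tuned_bound = changes_per_pass(batch_size=5000, batch_count=50)

print(default_bound, tuned_bound, tuned_bound // default_bound)
```

Under that assumption the tuned settings raise the per-pass ceiling from
100 to 250,000 changes. This is only a ceiling on work per pass and says
nothing about where the CPU time goes.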

We wanted to know if adjusting these parameters is a sound approach.

Thanks!

- Phil
