couchdb-user mailing list archives

From Adam Kocoloski
Subject Re: Replicated database size
Date Mon, 15 Mar 2010 19:38:55 GMT
On Mar 15, 2010, at 3:09 PM, Matthew Sinclair-Day wrote:

> Hi folks,
> I've been putting couch 10.1 on Solaris 10/x86 through its paces lately trying to understand
> its replication performance and behavior, and have noticed the size of pre-compacted replicas
> can vary from one host to another.
> In one test, the origin has roughly 1.2 million documents taking up 263MB of storage,
> but replicated size varies from one server to another:
> origin   : 263MB
> replica 1: 0.6GB
> replica 2: 0.7GB
> replica 3: 1.0GB
> As expected the replicas are larger than the compacted origin database, but I didn't
> expect such size differences from replica to replica.
> After compacting the origin (again) and the replicas, their sizes settle down to:
> origin   : 262.4MB
> replica 1: 262.4MB
> replica 2: 262.5MB
> replica 3: 262.4MB
> I'm trying to understand what the reason could be for the variance in pre-compacted database
> sizes.  All replicas are running the same build of CouchDB on the same version of Solaris,
> though replica3 is running on newer hardware in a VMWare container.
> Matt

Hi Matt, the variation in target DB file sizes comes from variation in the number and size of
the _bulk_docs calls the replicator makes.  The DB size is inversely correlated with the size
of an average _bulk_docs POST, and the size of a POST is governed by the relative speed of
the source and the target.  If the target is fast and the replication is limited by source
throughput, you'll see lots of very small calls to _bulk_docs.  Conversely, if the target is
slow, the replicator will batch writes together in blocks of 1000 and send them over.
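
To make the batch-size effect concrete, here is a rough toy model in Python.  It is not
CouchDB's storage engine.  It only assumes an append-only B-tree file in which every
commit appends the new documents plus a fresh copy of each tree node on the path from the
touched leaves up to the root.  The function names and the numbers (doc_bytes, node_bytes,
fanout) are made up for illustration.

```python
def tree_depth(n_docs, fanout):
    """Levels needed for a tree with the given fanout to index n_docs entries."""
    depth, capacity = 1, fanout
    while capacity < n_docs:
        capacity *= fanout
        depth += 1
    return depth


def estimated_file_size(total_docs, batch_size, doc_bytes=200,
                        node_bytes=4096, fanout=100):
    """Toy estimate of an append-only file's size before compaction.

    Every batch appends its documents plus new copies of the tree nodes
    it touches. The old copies stay in the file as garbage until compaction.
    """
    size = 0
    written = 0
    while written < total_docs:
        batch = min(batch_size, total_docs - written)
        written += batch
        depth = tree_depth(written, fanout)
        # Assume sequential inserts: a batch touches about batch/fanout leaves,
        # batch/fanout**2 nodes one level up, and so on, at least one per level.
        rewritten_nodes = sum(-(-batch // fanout ** level)
                              for level in range(1, depth + 1))
        size += batch * doc_bytes + rewritten_nodes * node_bytes
    return size


if __name__ == "__main__":
    total = 100_000
    for batch_size in (1, 10, 100, 1000):
        print(batch_size, estimated_file_size(total, batch_size))
```

In this model the document bytes are the same for every batch size, and only the number of
rewritten tree nodes changes.  Small batches pay for a full root-to-leaf path on almost
every document, which matches the pattern above of fast targets ending up with larger
files that shrink to the same size after compaction.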

In short, the faster your target server is, the larger the un-compacted target DB will be.
Looks like that VMWare container isn't slowing you down much at all :)  Best,
