couchdb-user mailing list archives

From Paul Bonser
Subject Re: Replication scalability on many databases
Date Sat, 12 Jun 2010 04:31:03 GMT
On Fri, Jun 11, 2010 at 5:30 PM, Chris Stockton wrote:
> So my next immediate thought is: why does replication need configured
> per database? Per server replication would work much better for us,
> and could likely be implemented in couchdb for optimal efficiency in
> high database count configurations. Our replication design and
> automatic-failover has grown very complicated, I think at this point
> it might be worth discussion for alternatives within couchdb. I am
> also up for discussion how other couchdb users have implemented
> replication on high (if 4 thousand is really considered so) db count
> machines.

One alternative you might consider if you can't get 4k simultaneous
replications going is an approach which takes CouchDB out of the loop
altogether: rsync.

If you can be sure that no writes will be going to your failover
database until there's a failover, you can run rsync with the
--partial option, which keeps a partially transferred file so that an
interrupted transfer can pick up where it left off. Since CouchDB
files are append-only, rsync's delta-transfer algorithm will simply
see the destination files as missing a bit on the end, and thus
transfer only that part pretty quickly. The only time it will take
longer is after a compaction, at which point it will need to transfer
the whole file once again.
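
As a rough sketch of what that could look like when driven from a
script (say, from cron), here is a small Python wrapper that builds
and runs the rsync command. The paths, the host name "backup-host",
and the helper names are all hypothetical; only the rsync flags
(--archive, --partial, --dry-run) are real. Note that rsync only uses
its delta transfer by default when the destination is remote; for two
local paths it copies whole files unless told otherwise.

```python
import subprocess
from typing import List


def build_rsync_command(src_dir: str, dest: str, dry_run: bool = False) -> List[str]:
    """Return an rsync argv that mirrors a CouchDB data directory to `dest`."""
    cmd = [
        "rsync",
        "--archive",  # recurse and preserve times, permissions, ownership
        "--partial",  # keep a partially transferred file if the run is interrupted
    ]
    if dry_run:
        cmd.append("--dry-run")  # report what would be sent without copying
    # A trailing slash on the source means "the contents of this directory".
    cmd.append(src_dir.rstrip("/") + "/")
    cmd.append(dest)
    return cmd


def sync(src_dir: str, dest: str) -> None:
    """Run the transfer; raises CalledProcessError if rsync exits non-zero."""
    subprocess.run(build_rsync_command(src_dir, dest), check=True)


if __name__ == "__main__":
    # Hypothetical source path and destination host, for illustration only.
    print(" ".join(build_rsync_command("/var/lib/couchdb", "backup-host:/var/lib/couchdb/")))
```

Running it with dry_run=True first is a cheap way to see which
database files rsync thinks have changed before letting it copy
anything.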

Even if it doesn't work for your failover, it could potentially still
work well for your backup, assuming your backup is really a backup and
not a failover-failover.

Paul Bonser