incubator-couchdb-user mailing list archives

From Blair Zajac <>
Subject Replication and new user questions
Date Tue, 25 Aug 2009 21:10:39 GMT

We're looking at using CouchDB's replication to allow us to easily have 
multi-master replicating databases across multiple facilities (e.g. Los 
Angeles, Albuquerque, and Bristol, England).  It looks like it'll be the 
perfect tool for the job.

Some questions on the current implementation, and on the work that I've read 
will be in forthcoming releases.

1) What's the most robust automatic replication mechanism?  While continuous 
replication looks nice, I see there are some tickets open on it and that it has 
issues with four nodes.  Would a more robust solution, though a little slower and 
heavier, be to have an update_notification script that manually POSTs to _replicate?
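For reference, a minimal sketch of what I mean by the update_notification 
approach, assuming the script is registered under [update_notification] in 
local.ini so CouchDB feeds it one JSON line per changed database on stdin 
(e.g. {"type":"updated","db":"db1"}); SOURCE_HOST and PEER are placeholder 
values for the two servers:

```shell
#!/bin/sh
# Placeholder endpoints; substitute the real servers.
SOURCE_HOST=${SOURCE_HOST:-http://localhost:5984}
PEER=${PEER:-http://remote:5984}

db_from_notification() {
  # Pull the "db" field out of a notification line.
  echo "$1" | sed 's/.*"db":"\([^"]*\)".*/\1/'
}

replicate_body() {
  # One-shot push replication body: local db -> same db on the peer.
  printf '{"source": "%s", "target": "%s/%s"}' "$1" "$PEER" "$1"
}

# Main loop (wiring; runs until CouchDB closes the notification pipe):
# while read line; do
#   db=`db_from_notification "$line"`
#   curl -s -X POST "$SOURCE_HOST/_replicate" \
#        -H 'Content-Type: application/json' \
#        -d "`replicate_body "$db"`" >/dev/null
# done
```

Each notification would then trigger a one-shot replication, which is slower 
than continuous replication but restarts cleanly on every update.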

2) With the persistent continuous replication feature, is there a way to stop 
continuous replication without restarting couchdb?  Will there be a way to 
manage the list of replicant databases when the persistent continuous 
replication feature is complete?
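On stopping a running replication: my assumption from the tickets is that a 
cancellation form of the _replicate body will exist, i.e. repeating the 
original source/target/continuous triple with "cancel" set to true.  A sketch 
of building such a body (hedged; this assumes a release whose _replicate 
endpoint accepts a "cancel" field):

```shell
#!/bin/sh
# Build a cancellation body matching a previously started continuous
# replication.  $1 = source db, $2 = target db.
cancel_body() {
  printf '{"source": "%s", "target": "%s", "continuous": true, "cancel": true}' "$1" "$2"
}

# Usage (HOST1 is a placeholder for the server that started the replication):
# curl -X POST $HOST1/_replicate -d "`cancel_body db1 db2`"
```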

3) How does continuous replication deal with network outages, say if a link goes 
down between the Los Angeles and Bristol data centers?  Does CouchDB deal with a 
hanging TCP connection ok?

4) It would be nice for CouchDB to maintain a list of replicant databases that 
it will automatically push changes to, so this list could be kept in CouchDB 
itself instead of in an external script.  Is there any work on a feature like 
this?  It could easily be done with an external update_notification script.

5) I wrote the following Bourne shell script and after running it for an hour, 
it consumes 100% of a CPU.  This is even after stopping the shell script and 
compacting both databases.  What would explain this behavior?



#!/bin/sh

# Endpoint and database URLs (example values; substitute your own hosts).
HOST1=http://host1:5984
HOST2=http://host2:5984
DB1=$HOST1/db1
DB2=$HOST2/db2

curl -X PUT $DB1
curl -X PUT $DB2

curl -X POST $HOST1/_replicate -d '{"source": "db1", "target": "db2",
"continuous": true}'
curl -X POST $HOST2/_replicate -d '{"source": "db2", "target": "db1",
"continuous": true}'

while true; do
   seconds="`date +%s`"
   echo "Working on $DB1/$seconds"
   # Double quotes so $seconds expands inside the JSON body.
   rev=`curl -X PUT $DB1/$seconds -d "{\"name\": \"$seconds\"}" 2>/dev/null |
python2.6 -c 'import cjson, sys; print cjson.decode(sys.stdin.read())["rev"]'`

   while curl $DB2/$seconds 2>/dev/null | grep error; do
     echo "  Does not exist yet at $DB2/$seconds."
     sleep 1
   done
   echo "  It exists now at $DB2/$seconds."

   curl -X DELETE "$DB2/$seconds?rev=$rev" >/dev/null 2>&1

   while curl $DB1/$seconds 2>/dev/null | grep _rev; do
     echo "  It has not been deleted yet at $DB1/$seconds"
     sleep 1
   done
   echo "  It has been deleted at $DB1/$seconds."
done

6) The other thing I noticed is that after compacting the databases, these 
messages appear frequently in the log, when they didn't appear before the 
compaction:
[info] [<0.185.0>] recording a checkpoint at source update_seq 6748
[info] [<0.185.0>] A server has restarted sinced replication start. Not
recording the new sequence number to ensure the replication is redone and
documents reexamined.

And then later got this:

[info] [<0.4831.9>] - - 'GET' /db1/1251156832 404
[info] [<0.18214.10>] - - 'PUT' /db1/1251156833 201
[error] [<0.186.0>] changes_loop died with reason {system_limit,

Thanks and nice work on the project.

