couchdb-user mailing list archives

From Daniel Gonzalez <gonva...@gonvaled.com>
Subject Replications stopping unexpectedly
Date Fri, 27 Apr 2012 10:03:28 GMT
Hello,

I will describe my problem in a general way. If more details are needed, I
will try to gather them from my production environments.
We have several CouchDB instances, with a bunch of databases. Some of these
databases are connected via replication.
Some of the replications run through an ssh tunnel, others over a direct
internet connection. The latency between CouchDB instances ranges from a
few milliseconds up to several hundred milliseconds.

My problem is that it is very common for the replications to stop. It could
be due to connectivity being lost (sometimes the ssh tunnels fail and must
be recreated), but this is not the only reason.

And worse: the replications are not restarted automatically. They stay in
the error state. The problem is so frequent that I have a replication
monitor process that runs every 5 minutes, looks for replications in error,
and deletes and recreates their replication documents. This is the only
method I have found to reliably restart the replications.
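
A minimal sketch of what such a monitor could look like (not the actual
script). It assumes the replications are defined as documents in the
_replicator database and that failed ones carry _replication_state set to
"error". The base URL and all function names are hypothetical, and
authentication is left out.

```python
import json
import urllib.parse
import urllib.request

# Assumption: CouchDB reachable locally without authentication (hypothetical URL).
REPLICATOR_URL = "http://localhost:5984/_replicator"


def failed_replications(docs):
    """Return the replication documents whose state is "error"."""
    return [d for d in docs if d.get("_replication_state") == "error"]


def clean_copy(doc):
    """Copy a replication document without _rev and the server-set
    _replication_* fields, so it can be stored again as a new document."""
    return {k: v for k, v in doc.items()
            if k != "_rev" and not k.startswith("_replication_")}


def _request(method, url, body=None):
    """Send a JSON request and return the decoded JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def restart_failed(base_url=REPLICATOR_URL):
    """Delete and recreate every replication document that is in error.

    Returns the ids of the documents that were recreated.
    """
    rows = _request("GET", base_url + "/_all_docs?include_docs=true")["rows"]
    # Skip design documents; only replication documents are of interest.
    docs = [r["doc"] for r in rows if not r["id"].startswith("_design/")]
    restarted = []
    for doc in failed_replications(docs):
        doc_url = base_url + "/" + urllib.parse.quote(doc["_id"], safe="")
        _request("DELETE", doc_url + "?rev=" + urllib.parse.quote(doc["_rev"]))
        _request("PUT", doc_url, clean_copy(doc))
        restarted.append(doc["_id"])
    return restarted
```

Calling restart_failed() from cron every 5 minutes would reproduce the
behaviour described above.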

Is anybody else experiencing similar problems? Do you have any suggestions
on how to make replications more robust against connectivity issues?
Are there other methods to restart failed replications, apart from
redefining them?

Thanks,
Daniel Gonzalez
