couchdb-dev mailing list archives

From Chris Stockton <>
Subject Re: CouchDB Replication lacking resilience for many database
Date Tue, 11 Oct 2011 04:03:00 GMT

On Mon, Oct 10, 2011 at 5:18 PM, Adam Kocoloski <> wrote:
> On Oct 10, 2011, at 8:02 PM, Chris Stockton wrote:
>> Hello,
>> On Mon, Oct 10, 2011 at 4:19 PM, Filipe David Manana
>> <> wrote:
>>> On Tue, Oct 11, 2011 at 12:03 AM, Chris Stockton
>>> <> wrote:
>>> Chris,
>>> That said work is in the '1.2.x' branch (and master).
>>> CouchDB recently migrated from SVN to GIT, see:
>> Thank you very much for the response Filipe, do you possibly have any
>> documentation or more detailed summary on what these changes include
>> and possible benefits of them? I would love to hear about any tweaking
>> or replication tips you may have for our growth issues, perhaps you
>> could answer a basic question if nothing else: Do the changes in this
>> branch minimize the performance impact of continuous replication on
>> many databases?
>> Regardless I plan on getting a build of that branch and doing some
>> testing of my own very soon.
>> Thank you!
>> -Chris
> I'm pretty sure that even in 1.2.x and master each replication with a remote source still
> requires one dedicated TCP connection to consume the _changes feed.  Replications with a
> local source have always been able to use a connection pool per host:port combination.  That's
> not to downplay the significance of the rewrite of the replicator in 1.2.x; Filipe put quite
> a lot of time into it.
> The link to "those darn errors" just pointed to the mbox browser for September 2011.
> Do you have a more specific link?  Regards,
> Adam

Well, I will remain optimistic that the rewrite has solved several of
my issues regardless. I guess the idle TCP connections by themselves
are not too bad; I think the issue is when they all start to work
simultaneously =)

Sorry Adam, here is a better link,
the actual text was:


It seems that I am randomly getting errors about crashes as our
replicator runs. All this replicator does is make sure that all
databases on the master server replicate to our failover by checking

  - I notice the below error in the logs, anywhere from 0 to 30 at a time.
  - It seems that a database might start replicating okay then stop.
  - These errors [1] are on the failover pulling from master
  - No errors are displayed on the master server
  - The databases inside the URL in the db_not_found portion of the
error are always available via curl from the failover machine, which
makes the error strange; somehow it thinks it can't find the database
  - Master seems healthy at all times, all databases are available, no
errors in log

[1] --
  [Mon, 12 Sep 2011 18:34:14 GMT] [error] [<0.22466.5305>]
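
For readers following the thread, here is a rough sketch of the kind of replicator described above: it lists the databases on the master and asks the failover to pull each one continuously. This is not the script from the original message. The host names, the function names, and the choice to skip databases whose names start with an underscore are all assumptions made for illustration. Only the `/_all_dbs` and `/_replicate` endpoints and the `source`, `target`, `continuous` and `create_target` fields come from CouchDB's HTTP API.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical hosts, not taken from the thread.
MASTER_URL = "http://master.example.com:5984"
FAILOVER_URL = "http://failover.example.com:5984"


def fetch_json(url, payload=None):
    """GET url, or POST payload as JSON when one is given; decode the JSON reply."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def user_databases(all_dbs):
    """Drop system databases such as _users and _replicator (an assumed policy)."""
    return [name for name in all_dbs if not name.startswith("_")]


def replication_request(master_url, db_name):
    """Body for POST /_replicate on the failover: pull db_name from the master."""
    # Database names may contain "/", so the name is escaped inside the source URL.
    source = master_url.rstrip("/") + "/" + urllib.parse.quote(db_name, safe="")
    return {
        "source": source,
        "target": db_name,
        "continuous": True,
        "create_target": True,
    }


def ensure_replications(master_url, failover_url):
    """Ask the failover to start a continuous pull for every master database."""
    names = user_databases(fetch_json(master_url + "/_all_dbs"))
    for name in names:
        fetch_json(failover_url + "/_replicate", replication_request(master_url, name))
    return names


if __name__ == "__main__":
    ensure_replications(MASTER_URL, FAILOVER_URL)
```

As Adam notes above, each pull from a remote source holds its own connection to the `_changes` feed, so a loop like this opens roughly one long-lived connection to the master per database.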
