incubator-couchdb-dev mailing list archives

From Randall Leeds <randall.le...@gmail.com>
Subject Re: CouchDB Replication lacking resilience for many database
Date Tue, 11 Oct 2011 01:52:31 GMT
On Mon, Oct 10, 2011 at 17:02, Chris Stockton <chrisstocktonaz@gmail.com> wrote:

> Hello,
>
> On Mon, Oct 10, 2011 at 4:19 PM, Filipe David Manana
> <fdmanana@apache.org> wrote:
> > On Tue, Oct 11, 2011 at 12:03 AM, Chris Stockton
> > <chrisstocktonaz@gmail.com> wrote:
> > Chris,
> >
> > That said work is in the '1.2.x' branch (and master).
> > CouchDB recently migrated from SVN to GIT, see:
> > http://couchdb.apache.org/community/code.html
> >
>
> Thank you very much for the response Filipe, do you possibly have any
> documentation or more detailed summary on what these changes include
> and possible benefits of them? I would love to hear about any tweaking
> or replication tips you may have for our growth issues, perhaps you
> could answer a basic question if nothing else: Do the changes in this
> branch minimize the performance impact of continuous replication on
> many databases?
>

The primary change, as I understand it, is that CouchDB now explicitly
manages a pool of HTTP connections per replication.

Previously, the pool was handled by the HTTP client module CouchDB uses,
ibrowse, which pools connections per host/port. As a result, the pool was
shared between all the replications pulling from a given server.

The config file has settings for changing how these pools behave, under the
replicator heading:
max_http_sessions
max_http_pipeline_size

The first refers to the size of the per-host pool. The second refers to the
number of requests that can be queued on each connection in that pool.
Unfortunately, there is one more setting which is not exposed: the maximum
number of requests to try at once per replication, which is fixed at 100.
(Side note to devs: should we provide a quick patch for this for 1.1.1?)
CouchDB does not elegantly handle the case when the pool is completely
utilized. This is not a problem in the new replication code in 1.2.
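
To illustrate where the two exposed settings live, here is a minimal Python
sketch that parses a [replicator] section from ini text. The sample values
are placeholders, not recommendations or confirmed defaults, and the helper
name read_replicator_settings is hypothetical.

```python
import configparser

# Hypothetical excerpt of a CouchDB ini file; the values are placeholders.
SAMPLE_INI = """
[replicator]
max_http_sessions = 10
max_http_pipeline_size = 10
"""

def read_replicator_settings(ini_text):
    """Return (max_http_sessions, max_http_pipeline_size) from ini text."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser["replicator"]
    return (section.getint("max_http_sessions"),
            section.getint("max_http_pipeline_size"))

sessions, pipeline = read_replicator_settings(SAMPLE_INI)
print(sessions, pipeline)
```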

For the time being though, the following formula should hold true or you
will experience problems:

max_http_sessions * max_http_pipeline_size >= 100 * N

where N is the maximum number of concurrent replications triggered to pull
or push from a single host.

I believe this still applies in 1.1. I'm pretty sure it applied at some
point in earlier releases.
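
The inequality above can be checked with a short Python sketch. It only
encodes the arithmetic from this message; the function names are
hypothetical, and the constant 100 is the fixed per-replication limit
described above, not a value read from CouchDB.

```python
# Fixed, unexposed per-replication request limit cited in this message.
REQUESTS_PER_REPLICATION = 100

def pool_is_sufficient(max_http_sessions, max_http_pipeline_size, n_replications):
    """True if max_http_sessions * max_http_pipeline_size >= 100 * N."""
    capacity = max_http_sessions * max_http_pipeline_size
    demand = REQUESTS_PER_REPLICATION * n_replications
    return capacity >= demand

def max_safe_replications(max_http_sessions, max_http_pipeline_size):
    """Largest N per host for which the inequality still holds."""
    capacity = max_http_sessions * max_http_pipeline_size
    return capacity // REQUESTS_PER_REPLICATION

print(pool_is_sufficient(10, 10, 1))   # True: 100 >= 100
print(pool_is_sufficient(10, 10, 5))   # False: 100 < 500
print(max_safe_replications(20, 50))   # 10
```

For example, with a pool of 20 sessions and a pipeline size of 50, the
product is 1000, which by this formula covers up to 10 concurrent
replications against a single host.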

-Randall



>
> Regardless I plan on getting a build of that branch and doing some
> testing of my own very soon.
>
> Thank you!
>
> -Chris
>
