couchdb-dev mailing list archives

From "Adam Kocoloski (JIRA)" <>
Subject [jira] Commented: (COUCHDB-481) Continuous replication stability issues
Date Tue, 25 Aug 2009 03:43:59 GMT


Adam Kocoloski commented on COUCHDB-481:

I haven't been tagging commits with this ticket number, but here's the list so far:

one flat-out bugfix,
r807308, r807354: more precise and accurate calculation of replication progress

one new feature that could be classified as a bugfix depending on your point of view,
r807342, r807345: follow 302 redirects during replication

and two significant performance improvements (thanks rnewson for all the stress testing):
r807320, r807360: checkpoint at most once per 5 seconds
r807208, r807459, r807461: minimize the number of full commit operations
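
For readers unfamiliar with the idea, "checkpoint at most once per 5 seconds" can be illustrated with a small throttle. This is only a sketch in Python, not the code from the commits above (the replicator itself is Erlang); the class name, the `maybe_checkpoint` method and the injectable clock are all hypothetical.

```python
import time


class CheckpointThrottle:
    """Allow a checkpoint at most once per `min_interval` seconds (illustrative only)."""

    def __init__(self, min_interval=5.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock   # injectable so the throttle can be tested without sleeping
        self.last = None     # time of the last checkpoint, None if none has run yet

    def maybe_checkpoint(self, do_checkpoint):
        """Run `do_checkpoint` unless one already ran within the interval."""
        now = self.clock()
        if self.last is not None and now - self.last < self.min_interval:
            return False     # too soon: skip this checkpoint
        do_checkpoint()
        self.last = now
        return True
```

A caller would invoke `maybe_checkpoint` after each replicated batch; most calls return immediately, so the cost of recording a checkpoint is paid at most once per interval regardless of write rate.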

> Continuous replication stability issues
> ---------------------------------------
>                 Key: COUCHDB-481
>                 URL:
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Database Core
>    Affects Versions: 0.10
>            Reporter: Robert Newson
>            Assignee: Adam Kocoloski
>            Priority: Blocker
>             Fix For: 0.10
> I've been trying continuous replication with different combinations of push/pull with
> 2, 3 and 4 nodes. I've hit several problems and discussed them on IRC with jan___ and kocolosk.
> Firstly, the status page in Futon (and the output of _active_tasks) sometimes becomes
> inaccurate (and does not recover). This complicates investigation of the more serious problems.
> I configured a circle of four nodes with continuous pull replication and used 'ab' to
> write documents to the first one. Success is for all documents to appear at all nodes. For
> small batches of documents, this works. It fails, every time, with large numbers. I use batch=ok
> on all requests and have not successfully run a 100k run.
> The replication task at some point in the circle eventually dumps a huge stacktrace (which
> kocolosk has seen and I would need to sanitize private server names from before I could post)
> and dies, and is not restarted. Worse, the client process injecting the documents also dies.
> I have had perfect replication runs with 2 and 3 nodes in a circle, and no successful
> replication runs with 4 nodes. Using a star pattern (where each node pulls or pushes to the
> remaining three) fails even more rapidly.
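
The two topologies described in the report can be sketched as lists of (source, target) pairs. This is a rough illustration, not the reporter's actual setup: the node URLs are placeholders, the direction of the circle (each node pulling from its predecessor) is an assumption, and the JSON body follows the `source` / `target` / `continuous` fields accepted by CouchDB's `_replicate` endpoint.

```python
import json


def ring_pulls(nodes):
    """Circle: each node pulls from its predecessor (direction is an assumption)."""
    n = len(nodes)
    return [(nodes[(i - 1) % n], nodes[i]) for i in range(n)]


def star_pulls(nodes):
    """'Star' as described in the report: every node pulls from all the others."""
    return [(src, tgt) for tgt in nodes for src in nodes if src != tgt]


def replicate_body(source, target):
    """JSON body for a continuous replication request to _replicate."""
    return json.dumps({"source": source, "target": target, "continuous": True})


# Placeholder URLs, not the reporter's servers.
nodes = ["http://node%d:5984/db" % i for i in range(1, 5)]
ring = ring_pulls(nodes)
star = star_pulls(nodes)
```

With four nodes the circle needs 4 continuous replications while the star needs 12, which may be part of why the star pattern fails sooner under the same write load.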

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
