couchdb-dev mailing list archives

From "Darren Gibbard (JIRA)" <>
Subject [jira] [Created] (COUCHDB-2070) [1.4.0] CouchDB Replication Crashes
Date Tue, 18 Feb 2014 15:50:21 GMT
Darren Gibbard created COUCHDB-2070:

             Summary: [1.4.0] CouchDB Replication Crashes
                 Key: COUCHDB-2070
             Project: CouchDB
          Issue Type: Bug
      Security Level: public (Regular issues)
          Components: Replication
            Reporter: Darren Gibbard

Hi all,
I have an issue that appears to have followed me from v1.2.1 on Erlang R14 through an upgrade
to v1.4.0 on R16B01.

I have 20 "remote" nodes and one "central" node. Each of the remote instances is configured
with bidirectional replication (i.e. no replication is defined on the Central node directly).
There is a single main database of ~600,000 documents, ~11GB in size.
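
For illustration only, the setup described above could correspond to two documents on each
remote node, one pushing to the Central node and one pulling from it. This is a rough sketch
that assumes the replications are defined in the _replicator database and are continuous;
neither is stated above. The function name, host and database name are hypothetical
placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print the two replication documents a remote node
# would hold for bidirectional replication with the Central node.
# Assumptions (not from the report): _replicator is used, and
# "continuous": true is set on both documents.
replication_docs() {
    central_url="$1"   # placeholder, e.g. http://central.example:5984
    db="$2"            # placeholder database name

    # Push: local database -> Central node
    printf '{"_id":"push_%s","source":"%s","target":"%s/%s","continuous":true}\n' \
        "$db" "$db" "$central_url" "$db"

    # Pull: Central node -> local database
    printf '{"_id":"pull_%s","source":"%s/%s","target":"%s","continuous":true}\n' \
        "$db" "$central_url" "$db" "$db"
}

# Possible usage on a remote node (not run here; assumes a local CouchDB on port 5984):
#   replication_docs http://central.example:5984 maindb | while read -r doc; do
#       curl -X POST -H 'Content-Type: application/json' \
#            -d "$doc" http://localhost:5984/_replicator
#   done
```

With this layout every replication job runs on a remote node, which matches the note that
nothing is defined on the Central node itself.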

On the remote nodes, and more frequently on the Central node, I intermittently get *huge*
(3000+ line) errors in the logs; I have yet to track down the root cause. The open file handle
limit and ERL_MAX_PORTS are both set to values upwards of 16k.

Other stats:
$ sudo su - couchdb -c "lsof | grep -c ."

$ sudo netstat -npla | grep "ESTAB" | grep -c .

$ ps -ef | grep -c "^couchdb" 

An example log from a Remote node is:
An example log from the Central node is:

The main error line appears to be "{error,{error,req_timedout}}}}", for "_bulk_docs" requests
on the remote nodes and for "_revs_diff" requests on the Central node.
