Message-ID: <227179985.1257187259496.JavaMail.jira@brutus>
Date: Mon, 2 Nov 2009 18:40:59 +0000 (UTC)
From: "Adam Kocoloski (JIRA)"
To: dev@couchdb.apache.org
Reply-To: dev@couchdb.apache.org
Subject: [jira] Updated: (COUCHDB-270) Replication w/ Large Attachments Fails
In-Reply-To: <984982199.1235794152846.JavaMail.jira@brutus>

     [ https://issues.apache.org/jira/browse/COUCHDB-270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam Kocoloski updated COUCHDB-270:
-----------------------------------

    Component/s:     (was: Database Core)
                 Replication
    Fix Version/s: 0.11

> Replication w/ Large Attachments Fails
> --------------------------------------
>
>                 Key: COUCHDB-270
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-270
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 0.9
>         Environment: Apache CouchDB 0.9.0a748379
>            Reporter: Jeff Hinrichs
>            Assignee: Adam Kocoloski
>             Fix For: 0.11
>
>         Attachments: couchdb270_Test.py, couchdb270_Test.py, quick_fix.diff
>
>
> Attempting to replicate a database with largish attachments (<= ~18 MB of attachments in a doc, fewer than 200 docs) from one machine to another fails consistently and at the same point.
> Scenario:
> Both servers are running from HEAD, which I have been tracking for some time. This problem has been around as long as I've been using couch.
> Machine A holds the original database; Machine B is the server that is doing a PULL replication.
> During the replication, Machine A starts showing the following sporadically in the log:
> [Fri, 27 Feb 2009 14:02:48 GMT] [debug] [<0.5902.3>] 'GET'
> /delasco-invoices/INV00652429?revs=true&attachments=true&latest=true&open_revs=["425644723"]
> {1,
> 1}
> Headers: [{'Host',"192.168.2.52:5984"}]
> [Fri, 27 Feb 2009 14:02:48 GMT] [error] [<0.5901.3>] Uncaught error in
> HTTP request: {exit,normal}
> [Fri, 27 Feb 2009 14:02:48 GMT] [debug] [<0.5901.3>] Stacktrace:
> [{mochiweb_request,send,2},
> {couch_httpd,send_chunk,2},
> {couch_httpd_db,db_doc_req,3},
> {couch_httpd_db,do_db_req,2},
> {couch_httpd,handle_request,3},
> {mochiweb_http,headers,5},
> {proc_lib,init_p,5}]
> [Fri, 27 Feb 2009 14:02:48 GMT] [debug] [<0.5901.3>] HTTPd 500 error response:
> {"error":"error","reason":"normal"}
> As the replication continues, the frequency of these "Uncaught error in HTTP request: {exit,normal}" errors increases until the error is being repeated constantly.
> Then Machine B stops sending requests: no more log output, no errors. The last thing in Machine B's log file is:
> [Fri, 27 Feb 2009 14:03:24 GMT] [info] [<0.20893.1>] retrying
> couch_rep HTTP get request due to {error, req_timedout}: [104,116,
> 116,112,58,
> 47,47,49,
> 57,50,46,
> 49,54,56,
> 46,50,46,
> 53,50,58,
> 53,57,56,
> 52,47,100,
> 101,108,97,
> 115,99,111,
> 45,105,110,
> 118,111,
> 105,99,101,
> 115,47,73,
> 78,86,48,
> 48,54,53,
> 50,49,51,
> 56,63,114,
> 101,118,
> 115,61,116,
> 114,117,
> 101,38,97,
> 116,116,97,
> 99,104,109,
> 101,110,
> 116,115,61,
> 116,114,
> 117,101,38,
> 108,97,116,
> 101,115,
> 116,61,116,
> 114,117,
> 101,38,111,
> 112,101,
> 110,95,114,
> 101,118,
> 115,61,91,
> 34,
> <<"3070455362">>,
> 34,93]
> A request for status from the couchdb init.d script returns nothing, and checking the processes returns:
> (demo-couchdb)jlh@mars:~/projects/venvs/demo-couchdb/src$ ps ax|grep cou
> 29281 pts/2 S+ 0:00 grep cou
> (demo-couchdb)jlh@mars:~/projects/venvs/demo-couchdb/src$ ps ax|grep beam
> 29305 pts/2 R+ 0:00 grep beam
> In fact, couch has gone away completely on Machine B. Its death is so quick that it can't even say why.
> Attempts to incrementally replicate after the first failure die at exactly the same place.
> I can replicate this same database on the same machine from one database to another without issue. I can dump and reload the database with no problems.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
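
A note for anyone reading Machine B's last log entry: the integer list after `{error, req_timedout}` is not corrupt output. It appears to be an Erlang iolist, that is, character codes with an embedded binary, printed as raw terms instead of as a string. The following is a minimal Python sketch that flattens it into the URL being retried. It assumes the codes were copied correctly from the log entry quoted in this message; the helper name `flatten_iolist` is illustrative only and is not part of CouchDB.

```python
def flatten_iolist(term):
    """Flatten nested character codes and byte strings into one text string."""
    if isinstance(term, int):
        return chr(term)             # a single character code
    if isinstance(term, bytes):
        return term.decode("utf-8")  # stands in for an Erlang binary
    return "".join(flatten_iolist(part) for part in term)


# Character codes copied from the req_timedout log entry. The Erlang binary
# <<"3070455362">> is written here as a Python bytes literal.
logged_url = [
    104, 116, 116, 112, 58, 47, 47, 49, 57, 50, 46, 49, 54, 56, 46, 50, 46,
    53, 50, 58, 53, 57, 56, 52, 47, 100, 101, 108, 97, 115, 99, 111, 45,
    105, 110, 118, 111, 105, 99, 101, 115, 47, 73, 78, 86, 48, 48, 54, 53,
    50, 49, 51, 56, 63, 114, 101, 118, 115, 61, 116, 114, 117, 101, 38, 97,
    116, 116, 97, 99, 104, 109, 101, 110, 116, 115, 61, 116, 114, 117, 101,
    38, 108, 97, 116, 101, 115, 116, 61, 116, 114, 117, 101, 38, 111, 112,
    101, 110, 95, 114, 101, 118, 115, 61, 91, 34, b"3070455362", 34, 93,
]

print(flatten_iolist(logged_url))
# prints: http://192.168.2.52:5984/delasco-invoices/INV00652138?revs=true&attachments=true&latest=true&open_revs=["3070455362"]
```

Decoded this way, the retried request has the same form as the GET requests shown in Machine A's log: a single document fetched with `attachments=true`, here document INV00652138.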