Message-ID: <484339853.1236090176109.JavaMail.jira@brutus>
Date: Tue, 3 Mar 2009 06:22:56 -0800 (PST)
From: "Jeff Hinrichs (JIRA)"
To: dev@couchdb.apache.org
Reply-To: dev@couchdb.apache.org
Subject: [jira] Commented: (COUCHDB-270) Replication w/ Large Attachments Fails
In-Reply-To: <984982199.1235794152846.JavaMail.jira@brutus>

    [ https://issues.apache.org/jira/browse/COUCHDB-270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678323#action_12678323 ]

Jeff Hinrichs commented on COUCHDB-270:
---------------------------------------

Adam, I am
reproducing similar results in my environment (a laptop), although, with less memory(?), neither 20M test is passing for me:

* 200x20m push: FAIL (connection refused)
* 200x20m pull: FAIL

Thank you for your work on this. Now that you are getting closer, should I update the tests to use proper attachments in addition to extremely large documents? If so, I was thinking of tests for documents with a 512K body size plus:

* one massive attachment (20MB)
* three attachments (10MB/6MB/4MB), 20MB overall in attachments
* a dozen attachments (5MB/5MB/10x1MB), 20MB overall in attachments

If you would like different or additional tests/parameters, let me know and I'll update the script to include them.

Now that I am on my second cup of joe, I should probably also add a test that focuses on a large number of revisions. Currently the tests create databases that are relatively pristine, with N revisions where N is 0 or very small. If you would like this in the tests, what N or Ns should be used?

Jeff

> Replication w/ Large Attachments Fails
> --------------------------------------
>
>                 Key: COUCHDB-270
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-270
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Database Core
>    Affects Versions: 0.9
>        Environment: Apache CouchDB 0.9.0a748379
>           Reporter: Jeff Hinrichs
>        Attachments: couchdb270_Test.py, quick_fix.diff
>
>
> Attempting to replicate a database with largish attachments (<= ~18MB of attachments in a doc, less than 200 docs) from one machine to another fails consistently and at the same point.
>
> Scenario:
>
> Both servers are running from HEAD, which I've been tracking for some time. This problem has been around as long as I've been using couch.
> Machine A holds the original database; Machine B is the server doing a PULL replication.
>
> During the replication, Machine A starts showing the following sporadically in the log:
>
> [Fri, 27 Feb 2009 14:02:48 GMT] [debug] [<0.5902.3>] 'GET'
> /delasco-invoices/INV00652429?revs=true&attachments=true&latest=true&open_revs=["425644723"]
> {1,1}
> Headers: [{'Host',"192.168.2.52:5984"}]
>
> [Fri, 27 Feb 2009 14:02:48 GMT] [error] [<0.5901.3>] Uncaught error in
> HTTP request: {exit,normal}
>
> [Fri, 27 Feb 2009 14:02:48 GMT] [debug] [<0.5901.3>] Stacktrace:
> [{mochiweb_request,send,2},
>  {couch_httpd,send_chunk,2},
>  {couch_httpd_db,db_doc_req,3},
>  {couch_httpd_db,do_db_req,2},
>  {couch_httpd,handle_request,3},
>  {mochiweb_http,headers,5},
>  {proc_lib,init_p,5}]
>
> [Fri, 27 Feb 2009 14:02:48 GMT] [debug] [<0.5901.3>] HTTPd 500 error response:
> {"error":"error","reason":"normal"}
>
> As the replication continues, these "Uncaught error in HTTP request: {exit,normal}" errors grow more frequent, until the error is being constantly repeated.
> Then Machine B stops sending requests: no more log output, no errors. The last thing in Machine B's log file is:
>
> [Fri, 27 Feb 2009 14:03:24 GMT] [info] [<0.20893.1>] retrying
> couch_rep HTTP get request due to {error, req_timedout}:
> [104,116,116,112,58,47,47,49,57,50,46,49,54,56,46,50,46,53,50,58,53,57,56,52,
>  47,100,101,108,97,115,99,111,45,105,110,118,111,105,99,101,115,47,73,78,86,
>  48,48,54,53,50,49,51,56,63,114,101,118,115,61,116,114,117,101,38,97,116,116,
>  97,99,104,109,101,110,116,115,61,116,114,117,101,38,108,97,116,101,115,116,
>  61,116,114,117,101,38,111,112,101,110,95,114,101,118,115,61,91,34,
>  <<"3070455362">>,34,93]
>
> A request for status from the couchdb init.d script returns nothing, and checking the processes returns:
>
> (demo-couchdb)jlh@mars:~/projects/venvs/demo-couchdb/src$ ps ax|grep cou
> 29281 pts/2 S+ 0:00 grep cou
> (demo-couchdb)jlh@mars:~/projects/venvs/demo-couchdb/src$ ps ax|grep beam
> 29305 pts/2 R+ 0:00 grep beam
>
> In fact, couch has gone away completely on Machine B; its death is so quick it can't even say why.
>
> Attempts to incrementally replicate after the first failure die at exactly the same place.
>
> I can replicate this same database on the same machine from one database to another without issue. I can dump and reload the database with no problems.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
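[Editor's aside: the retry message quoted above prints the failing request URL as an Erlang iolist of character codes; it decodes to http://192.168.2.52:5984/delasco-invoices/INV00652138?revs=true&attachments=true&latest=true&open_revs=["3070455362"]. For anyone reading such logs, a small helper can flatten these lists back into readable text. This is an illustrative sketch, not part of couch_rep; Python strings stand in for the <<"...">> Erlang binary terms.]

```python
def decode_iolist(items):
    """Flatten an Erlang-style iolist into a Python string.

    Integers are character codes; strings stand in for Erlang binaries
    (the <<"...">> terms seen in the log); lists may nest arbitrarily.
    """
    out = []
    for item in items:
        if isinstance(item, int):
            out.append(chr(item))          # character code -> character
        elif isinstance(item, str):
            out.append(item)               # stand-in for an Erlang binary
        elif isinstance(item, (list, tuple)):
            out.append(decode_iolist(item))  # nested iolist
        else:
            raise TypeError("unexpected iolist element: %r" % (item,))
    return "".join(out)

# The tail of the logged list: ...open_revs=["3070455362"]
tail = [115, 61, 91, 34, "3070455362", 34, 93]
print(decode_iolist(tail))  # → s=["3070455362"]
```

Feeding the full character list from the log into decode_iolist yields the request URL shown above.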
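[Editor's aside: the attachment layouts proposed earlier in the thread (one 20MB attachment; 10MB/6MB/4MB; 5MB/5MB plus ten 1MB) could be generated along these lines. This is a minimal sketch under stated assumptions, not the attached couchdb270_Test.py: the doc IDs and the make_doc helper are made up, and it only builds the JSON document rather than talking to a server. CouchDB accepts inline attachments as base64-encoded data under the _attachments field.]

```python
import base64

MB = 1024 * 1024

# The three attachment layouts proposed in the comment; each totals 20MB.
LAYOUTS = {
    "one_massive": [20 * MB],
    "three_mixed": [10 * MB, 6 * MB, 4 * MB],
    "dozen_small": [5 * MB, 5 * MB] + [1 * MB] * 10,
}

def make_doc(layout_name, body_size=512 * 1024):
    """Build a CouchDB doc dict with a 512K body and inline attachments.

    Hypothetical helper for illustration only; not from couchdb270_Test.py.
    """
    sizes = LAYOUTS[layout_name]
    doc = {"_id": "test_%s" % layout_name, "filler": "x" * body_size}
    doc["_attachments"] = {
        "att_%d.bin" % i: {
            "content_type": "application/octet-stream",
            # Inline attachments must be base64-encoded in the JSON body.
            "data": base64.b64encode(b"\0" * n).decode("ascii"),
        }
        for i, n in enumerate(sizes)
    }
    return doc
```

Pushing 200 such docs into a database and replicating it would mirror the 200x20m push/pull cases mentioned above.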