On Fri, Feb 27, 2009 at 8:57 AM, Adam Kocoloski wrote:
> Hi Jeff, I can pick this one up, but not before Monday. We do have some
> replicating-attachment JIRA tickets open and active, but it looks like
> there's some new stuff in this report too.  Feel free to file another one.
> Best,
>
> Adam

I'll review the current JIRA tickets to avoid filing a duplicate, and I'll
also work on building a reproducible test case for you. I hope a Python
script is OK with you.

Regards,

Jeff

>
> Sent from my iPhone
>
> On Feb 27, 2009, at 9:13 AM, "Jeff Hinrichs - DM&T" wrote:
>
>> Attempting to replicate a database with largish attachments (<= ~18MB
>> of attachments in a doc, fewer than 200 docs) from one machine to
>> another fails consistently and at the same point.
>>
>> Scenario:
>> Both servers are running from HEAD, and I've been tracking it for some
>> time.  This problem has been around as long as I've been using couch.
>>
>> Machine A holds the original database; Machine B is the server that is
>> doing a PULL replication.
>>
>> During the replication, Machine A starts showing the following
>> sporadically in the log:
>>
>> [Fri, 27 Feb 2009 14:02:48 GMT] [debug] [<0.5902.3>] 'GET'
>> /delasco-invoices/INV00652429?revs=true&attachments=true&latest=true&open_revs=["425644723"]
>> {1,1}
>> Headers: [{'Host',"192.168.2.52:5984"}]
>>
>> [Fri, 27 Feb 2009 14:02:48 GMT] [error] [<0.5901.3>] Uncaught error in
>> HTTP request: {exit,normal}
>>
>> [Fri, 27 Feb 2009 14:02:48 GMT] [debug] [<0.5901.3>] Stacktrace:
>> [{mochiweb_request,send,2},
>>  {couch_httpd,send_chunk,2},
>>  {couch_httpd_db,db_doc_req,3},
>>  {couch_httpd_db,do_db_req,2},
>>  {couch_httpd,handle_request,3},
>>  {mochiweb_http,headers,5},
>>  {proc_lib,init_p,5}]
>>
>> [Fri, 27 Feb 2009 14:02:48 GMT] [debug] [<0.5901.3>] HTTPd 500 error
>> response:
>> {"error":"error","reason":"normal"}
>>
>> As the replication continues, the frequency of the "Uncaught error in
>> HTTP request: {exit,normal}" errors increases until the error is
>> repeated constantly.  Then Machine B stops sending requests: no more
>> log output, no errors.  The last thing in Machine B's log file is:
>>
>> [Fri, 27 Feb 2009 14:03:24 GMT] [info] [<0.20893.1>] retrying
>> couch_rep HTTP get request due to {error, req_timedout}:
>> [104,116,116,112,58,47,47,49,57,50,46,49,54,56,46,50,46,53,50,58,
>>  53,57,56,52,47,100,101,108,97,115,99,111,45,105,110,118,111,105,
>>  99,101,115,47,73,78,86,48,48,54,53,50,49,51,56,63,114,101,118,
>>  115,61,116,114,117,101,38,97,116,116,97,99,104,109,101,110,116,
>>  115,61,116,114,117,101,38,108,97,116,101,115,116,61,116,114,117,
>>  101,38,111,112,101,110,95,114,101,118,115,61,91,34,
>>  <<"3070455362">>,
>>  34,93]
>>
>> A request for status from the couchdb init.d script returns nothing,
>> and checking the processes returns:
>>
>> (demo-couchdb)jlh@mars:~/projects/venvs/demo-couchdb/src$ ps ax|grep cou
>> 29281 pts/2    S+     0:00 grep cou
>> (demo-couchdb)jlh@mars:~/projects/venvs/demo-couchdb/src$ ps ax|grep beam
>> 29305 pts/2    R+     0:00 grep beam
>>
>> In fact, couch has gone away completely on Machine B.  Its death is so
>> quick that it can't even say why.
>>
>> Attempts to incrementally replicate after the first failure die at
>> exactly the same place.
>>
>> I can replicate this same database on the same machine from one
>> database to another without issue.  I can dump and reload the database
>> with no problems.
>>
>> I have reported this earlier and no one seemed to have an answer.  Is
>> there a specific issue in JIRA that addresses this problem?  If not,
>> is what I have here enough to start one, and should I?
>>
>> Regards,
>>
>> Jeff Hinrichs
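
For reference, the integer list in the req_timedout retry message above is an Erlang string (a list of character codes) with a binary revision ID embedded in it, which makes it hard to read at a glance. Below is a minimal Python sketch for decoding it. The names decode_erlang_iolist and retry_url are hypothetical, and the sketch assumes the list holds only character codes and binaries, as it does in the log excerpt.

```python
def decode_erlang_iolist(items):
    """Flatten a list of character codes (ints) and binaries (bytes) into text."""
    parts = []
    for item in items:
        if isinstance(item, int):
            # An Erlang string is a list of integer code points.
            parts.append(chr(item))
        elif isinstance(item, bytes):
            # An Erlang binary such as <<"3070455362">> is written here as bytes.
            parts.append(item.decode("utf-8"))
        else:
            raise TypeError(f"unsupported element: {item!r}")
    return "".join(parts)


# The list from Machine B's log, transcribed element by element.
retry_url = [
    104, 116, 116, 112, 58, 47, 47,
    49, 57, 50, 46, 49, 54, 56, 46, 50, 46, 53, 50,
    58, 53, 57, 56, 52,
    47, 100, 101, 108, 97, 115, 99, 111, 45,
    105, 110, 118, 111, 105, 99, 101, 115,
    47, 73, 78, 86, 48, 48, 54, 53, 50, 49, 51, 56,
    63, 114, 101, 118, 115, 61, 116, 114, 117, 101,
    38, 97, 116, 116, 97, 99, 104, 109, 101, 110, 116, 115,
    61, 116, 114, 117, 101,
    38, 108, 97, 116, 101, 115, 116, 61, 116, 114, 117, 101,
    38, 111, 112, 101, 110, 95, 114, 101, 118, 115, 61,
    91, 34, b"3070455362", 34, 93,
]

if __name__ == "__main__":
    print(decode_erlang_iolist(retry_url))
```

Run on the list from the log, this prints the URL of the request that timed out:
http://192.168.2.52:5984/delasco-invoices/INV00652138?revs=true&attachments=true&latest=true&open_revs=["3070455362"]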