couchdb-user mailing list archives

From "Jeff Hinrichs - DM&T" <dunde...@gmail.com>
Subject Re: replication error
Date Fri, 30 Jan 2009 05:27:50 GMT
On Thu, Jan 29, 2009 at 9:12 AM, Adam Kocoloski
<adam.kocoloski@gmail.com> wrote:
> Hi Jeff, thanks for the extra info.  Something funny is going on here.
>  These logs don't agree with your description of how you set up the
> replication.  In particular, in the .52 log it looks like you sent a
> replication request to .52 telling it to pull from itself.  Those debug
> lines that start with Url: are HTTP requests that the replicator is about to
> make.
>
> On .194 the first line in the logfile looks like a response to an HTTP
> request from a remote replicator  trying to pull from .194.  But then in the
> headers you see a {'Host',"192.168.2.52"} tuple.
>
> Could you have mixed up which log was which in this email?  It would make a
> lot more sense.  Let's confirm that first.  Best, Adam
>
> P.S.
>
> The logging in mochiweb_request would look like
>
>     case gen_tcp:send(Socket, Data) of
>         ok ->
>             ok;
> -        _ ->
> +        {error, Reason} ->
> +            io:format("mochiweb_request:send failed with reason ~p~n", [Reason]),
>             exit(normal)
>     end.
>
I am initiating the replication via Futon on machine .194, pulling from
remote http://192.168.2.52:5984/delasco-invoices into local test5-mars.
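
For anyone who wants to reproduce this without Futon: the pull replication described above (remote delasco-invoices on .52 into local test5-mars on .194) presumably amounts to a POST to /_replicate on the local node. The following is a minimal Python sketch under assumptions: the base URL for .194 (port 5984), the helper name build_replicate_request, and the JSON body with "source" and "target" fields are illustrative and not taken from the logs.

```python
import json
import urllib.request


def build_replicate_request(local_base, source, target):
    """Build (but do not send) a POST /_replicate request.

    local_base is the node that runs the replicator (hypothetical helper;
    the body shape with "source"/"target" keys is an assumption).
    """
    body = json.dumps({"source": source, "target": target}).encode("utf-8")
    return urllib.request.Request(
        local_base.rstrip("/") + "/_replicate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_replicate_request(
    "http://192.168.2.194:5984",                   # assumed address of the local node
    "http://192.168.2.52:5984/delasco-invoices",   # remote source from this report
    "test5-mars",                                  # local target from this report
)

# Sending is left commented out so the sketch has no side effects:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to print the URL and body first and confirm which node is being asked to pull from which.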

This is what I see in the log file on .194:

[Fri, 30 Jan 2009 05:11:47 GMT] [error] [emulator] Error in process
<0.98.0> with exit value:
{function_clause,[{lists,map,[#Fun<couch_rep.10.28922857>,ok]},{couch_rep,open_doc_revs,4},{couch_rep,'-enum_docs_parallel/3-fun-1-',3},{couch_rep,'-spawn_worker/3-fun-0-',3}]}



[Fri, 30 Jan 2009 05:11:47 GMT] [debug] [<0.107.0>] couch_rep HTTP get
request: "http://192.168.2.52:5984/delasco-invoices/INV00541353?revs=true&attachments=true&latest=true&open_revs=[\"2461225383\"]"

[Fri, 30 Jan 2009 05:11:47 GMT] [info] [<0.117.0>] retrying couch_rep
HTTP get request due to {error, connection_closed}:
"http://192.168.2.52:5984/delasco-invoices/INV00541343?revs=true&attachments=true&latest=true&open_revs=[\"2308904194\"]"

[Fri, 30 Jan 2009 05:11:47 GMT] [info] [<0.145.0>] retrying couch_rep
HTTP get request due to {error, connection_closed}:
"http://192.168.2.52:5984/delasco-invoices/INV00541315?revs=true&attachments=true&latest=true&open_revs=[\"3170383356\"]"

[Fri, 30 Jan 2009 05:11:47 GMT] [error] [<0.145.0>] couch_rep HTTP get
request failed after 10 retries:
"http://192.168.2.52:5984/delasco-invoices/INV00541315?revs=true&attachments=true&latest=true&open_revs=[\"3170383356\"]"

[Fri, 30 Jan 2009 05:11:48 GMT] [error] [emulator] Error in process
<0.145.0> with exit value:
{function_clause,[{lists,map,[#Fun<couch_rep.10.28922857>,ok]},{couch_rep,open_doc_revs,4},{couch_rep,'-enum_docs_parallel/3-fun-1-',3},{couch_rep,'-spawn_worker/3-fun-0-',3}]}



[Fri, 30 Jan 2009 05:11:48 GMT] [debug] [<0.117.0>] couch_rep HTTP get
request: "http://192.168.2.52:5984/delasco-invoices/INV00541343?revs=true&attachments=true&latest=true&open_revs=[\"2308904194\"]"

[Fri, 30 Jan 2009 05:11:48 GMT] [info] [<0.138.0>] retrying couch_rep
HTTP get request due to {error, connection_closed}:
"http://192.168.2.52:5984/delasco-invoices/INV00541322?revs=true&attachments=true&latest=true&open_revs=[\"3949544425\"]"

[Fri, 30 Jan 2009 05:11:48 GMT] [debug] [<0.138.0>] couch_rep HTTP get
request: "http://192.168.2.52:5984/delasco-invoices/INV00541322?revs=true&attachments=true&latest=true&open_revs=[\"3949544425\"]"

[Fri, 30 Jan 2009 05:11:50 GMT] [info] [<0.119.0>] retrying couch_rep
HTTP get request due to {error, connection_closed}:
"http://192.168.2.52:5984/delasco-invoices/INV00541341?revs=true&attachments=true&latest=true&open_revs=[\"2313067153\"]"

[Fri, 30 Jan 2009 05:11:50 GMT] [error] [<0.119.0>] couch_rep HTTP get
request failed after 10 retries:
"http://192.168.2.52:5984/delasco-invoices/INV00541341?revs=true&attachments=true&latest=true&open_revs=[\"2313067153\"]"

[Fri, 30 Jan 2009 05:11:50 GMT] [error] [emulator] Error in process
<0.119.0> with exit value:
{function_clause,[{lists,map,[#Fun<couch_rep.10.28922857>,ok]},{couch_rep,open_doc_revs,4},{couch_rep,'-enum_docs_parallel/3-fun-1-',3},{couch_rep,'-spawn_worker/3-fun-0-',3}]}
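
To see at a glance which documents exhausted their retries in an excerpt like the one above, a small script can regroup the wrapped lines into entries and pull out the document IDs. This is only a rough Python sketch: the function names are made up for illustration, and it assumes every entry starts with the "[timestamp] [level] [pid]" prefix seen in these logs.

```python
import re
from collections import Counter

# "[timestamp] [level] [pid] message" prefix, as seen in the excerpt.
ENTRY_RE = re.compile(r"^\[([^\]]+)\] \[(\w+)\] \[([^\]]+)\] (.*)$")
# Document ID: the path segment right before "?revs=" in the request URL.
DOC_RE = re.compile(r'/([^/?"]+)\?revs=')


def parse_entries(text):
    """Group wrapped log lines into one dict per log entry."""
    entries = []
    for line in text.splitlines():
        m = ENTRY_RE.match(line)
        if m:
            ts, level, pid, msg = m.groups()
            entries.append({"ts": ts, "level": level, "pid": pid, "msg": msg})
        elif entries and line.strip():
            # Continuation of the previous entry (the mailer wrapped it).
            entries[-1]["msg"] += " " + line.strip()
    return entries


def level_counts(entries):
    """Count entries per log level (info, debug, error, ...)."""
    return Counter(e["level"] for e in entries)


def failed_docs(entries):
    """Return document IDs from 'failed after N retries' entries."""
    docs = []
    for e in entries:
        if "failed after" in e["msg"]:
            m = DOC_RE.search(e["msg"])
            if m:
                docs.append(m.group(1))
    return docs


SAMPLE = r'''[Fri, 30 Jan 2009 05:11:47 GMT] [info] [<0.145.0>] retrying couch_rep
HTTP get request due to {error, connection_closed}:
"http://192.168.2.52:5984/delasco-invoices/INV00541315?revs=true&attachments=true&latest=true&open_revs=[\"3170383356\"]"

[Fri, 30 Jan 2009 05:11:47 GMT] [error] [<0.145.0>] couch_rep HTTP get
request failed after 10 retries:
"http://192.168.2.52:5984/delasco-invoices/INV00541315?revs=true&attachments=true&latest=true&open_revs=[\"3170383356\"]"
'''

entries = parse_entries(SAMPLE)
print(level_counts(entries))
print(failed_docs(entries))
```

Running it over the full couch.log instead of SAMPLE would show whether the failures cluster on particular documents, for example the ones with large attachments.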

-------
Sometimes the couchdb process on .194 just goes away.

Also, if I replicate the same data with no attachments, or with smaller
attachments (70k PDFs), it runs just fine.  I have run svn up and am now
running 0.9.0a739170-incubating on both machines.

In one run it got through 100 documents and then blew up.  I restarted
couch on .194 and retried, and it finished the remaining 88 docs (188
total in the db on .52).

I'm running svn up again and will try some more.

>
> On Jan 28, 2009, at 7:07 PM, Jeff Hinrichs - DM&T wrote:
>
>> On Wed, Jan 28, 2009 at 10:03 AM, Adam Kocoloski
>> <adam.kocoloski@gmail.com> wrote:
>>>
>>> Hi Jeff, I think I'll need a reproducible test case or a little more
>>> information to help debug this.  mochiweb_request:send exits on any error
>>> returned by the underlying gen_tcp:send, and unfortunately it doesn't
>>> bother
>>> to log the reason for the error.  You might try adding a debug statement
>>> to
>>> line 125 of mochiweb_request.erl to figure out the reason why .52 failed
>>> to
>>> serve this document GET request.
>>
>> Not an erlanger but can vi, can you tell me what to put there?  Currently
>> it is:
>>           exit(normal)
>>
>>>
>>> When you say that "the process has died" on .194, you mean the
>>> replication
>>> process, right?  Surely that error didn't bring down the entire database?
>>> Best,
>>>
>>> Adam
>>
>> Sorry, but I mean the entire couchdb process, not just the replication
>> process.
>> I initiate the request from the remote machine (.194); it fails out
>> after a while with the error, then
>>
>> jlh@mars:~$ ps ax|grep couch
>> 28145 pts/2    S+     0:00 tail -f /usr/local/var/log/couchdb/couch.log
>> 28375 pts/3    S+     0:00 grep couch
>> jlh@mars:~$ sudo /usr/local/etc/init.d/couchdb status
>>
>> jlh@mars:~$
>>
>> All couch-related processes are now gone on .194.
>>
>> -----------------
>> .194 log shows:
>> [Wed, 28 Jan 2009 23:50:46 GMT] [debug] [<0.1742.0>] 'GET'
>> /invoices1/INV00541323?revs=true&attachments=true&latest=true&open_revs=["2017454730"]
>> {1,1}
>> Headers: [{'Connection',"keep-alive"},{'Host',"192.168.2.52"},{"Te",[]}]
>>
>> [Wed, 28 Jan 2009 23:50:46 GMT] [error] [<0.1742.0>] Uncaught error in
>> HTTP request: {exit,normal}
>>
>> [Wed, 28 Jan 2009 23:50:46 GMT] [debug] [<0.1742.0>] Stacktrace:
>> [{mochiweb_request,send,2},
>>            {couch_httpd,send_chunk,2},
>>            {couch_httpd_db,'-db_doc_req/3-fun-1-',4},
>>            {lists,foldl,3},
>>            {couch_httpd_db,db_doc_req,3},
>>            {couch_httpd_db,do_db_req,2},
>>            {couch_httpd,handle_request,3},
>>            {mochiweb_http,headers,4}]
>>
>> [Wed, 28 Jan 2009 23:50:46 GMT] [debug] [<0.1742.0>] HTTPd 500 error
>> response:
>> {"error":"error","reason":"normal"}
>> --------------
>> .52 log shows:
>> [Wed, 28 Jan 2009 23:49:35 GMT] [debug] [<0.452.0>]     Url:
>> "http://192.168.2.52:5984/invoices1/INV00541300?revs=true&attachments=true&latest=true&open_revs=[\"1219578511\"]"
>>
>> [Wed, 28 Jan 2009 23:49:35 GMT] [debug] [<0.453.0>]     Url:
>> "http://192.168.2.52:5984/invoices1/INV00653664?revs=true&attachments=true&latest=true&open_revs=[\"2059085364\"]"
>>
>> [Wed, 28 Jan 2009 23:49:35 GMT] [debug] [<0.454.0>]     Url:
>> "http://192.168.2.52:5984/invoices1/INV00652895?revs=true&attachments=true&latest=true&open_revs=[\"2562102070\"]"
>>
>> [Wed, 28 Jan 2009 23:49:35 GMT] [debug] [<0.455.0>]     Url:
>> "http://192.168.2.52:5984/invoices1/INV00652894?revs=true&attachments=true&latest=true&open_revs=[\"268796200\"]"
>>
>> [Wed, 28 Jan 2009 23:49:35 GMT] [info] [<0.352.0>] 192.168.2.52 - -
>> 'POST' /_replicate 500
>> -----------------
>> couchdb on the remote machine (.52) is just humming along fine.
>>
>> Let me know what you need and I'll do my best.  Sorry for the long
>> pause between the first report and now.  I was dashing out of the
>> house to work and wanted to get the initial report out.
>>
>> Regards,
>>
>> Jeff
>>
>>>
>>> On Jan 28, 2009, at 10:17 AM, Jeff Hinrichs - DM&T wrote:
>>>
>>>> replicating from 192.168.2.52 [0.9.0a738346-incubating]  ->
>>>> 192.168.2.194 [0.9.0a738497-incubating]
>>>>
>>>> -192.168.2.52:-
>>>> Eshell V5.6.4  (abort with ^G)
>>>> 1> init:script_id().
>>>> {"OTP  APN 181 01","R12B"}
>>>>
>>>> -192.168.2.194-
>>>> Eshell V5.6.3  (abort with ^G)
>>>> 1> init:script_id().
>>>> {"OTP  APN 181 01","R12B"}
>>>>
>>>> replication initiated in futon on .194 pulling from .52
>>>>
>>>>
>>>> During the process, I see this in the log...
>>>>
>>>> [Wed, 28 Jan 2009 14:29:47 GMT] [debug] [<0.62.0>] 'GET'
>>>> /invoices/INV00651983?revs=true&attachments=true&latest=true&open_revs=["3597612357"]
>>>> {1,1}
>>>> Headers: [{'Connection',"keep-alive"},{'Host',"192.168.2.52"},{"Te",[]}]
>>>>
>>>> [Wed, 28 Jan 2009 14:29:53 GMT] [error] [<0.62.0>] Uncaught error in
>>>> HTTP request: {exit,normal}
>>>>
>>>> [Wed, 28 Jan 2009 14:29:53 GMT] [debug] [<0.62.0>] Stacktrace:
>>>> [{mochiweb_request,send,2},
>>>>          {couch_httpd,send_chunk,2},
>>>>          {couch_httpd_db,'-db_doc_req/3-fun-1-',4},
>>>>          {lists,foldl,3},
>>>>          {couch_httpd_db,db_doc_req,3},
>>>>          {couch_httpd_db,do_db_req,2},
>>>>          {couch_httpd,handle_request,3},
>>>>          {mochiweb_http,headers,4}]
>>>>
>>>> [Wed, 28 Jan 2009 14:29:53 GMT] [debug] [<0.62.0>] HTTPd 500 error
>>>> response:
>>>> {"error":"error","reason":"normal"}
>>>>
>>>> Checking the status of couch on .194 at this point shows that the
>>>> process has died
>>>>
>>>> repeated attempts fail, on different documents
>>>>
>>>>
>>>> regards,
>>>>
>>>> Jeff
>>>
>>>
>
