incubator-couchdb-dev mailing list archives

From Robert Newson <robert.new...@gmail.com>
Subject Re: Connection refused when inserting document and reached_max_restart_intensity in the log.
Date Sat, 03 Oct 2009 14:35:00 GMT
This is now COUCHDB-517 and includes a patch that fixes the problem.

B.

On Sat, Oct 3, 2009 at 2:34 PM, Robert Newson <robert.newson@gmail.com> wrote:
> I should point out that my test does this:
>
> 1) PUT _config/uuid/algorithm with "random"
> 2) insert some documents
> 3) PUT _config/uuid/algorithm with "sequential"
> 4) insert some documents
>
> If you loop that, and insert as few as 10 documents at 2) and 4), you
> will get a connection refused and the stacktrace output, within 60
> seconds.
>
>
> On Sat, Oct 3, 2009 at 2:33 PM, Robert Newson <robert.newson@gmail.com> wrote:
>> Ok, I've got a little further. If I change my test to much shorter runs
>> (even 10 documents), I can reproduce the connection refused symptom
>> and the stacktrace I pasted originally in under a minute, every time.
>>
>> What appears to be happening is that the couch_uuids gen_server is
>> failing (being restarted too frequently), part of the supervision tree
>> is torn down and rebuilt, and a concurrent write operation fails while
>> that is happening. Since I'm pretty sure that's not what should happen
>> with Erlang/OTP, it's hopefully a straightforward bug.
>>
>> Alas, my test client is in Java (using httpclient 4.0, fwiw), so I
>> can't easily post a unit test for this right now.
>>
>> B.
>>
>> On Sat, Oct 3, 2009 at 1:52 PM, Robert Newson <robert.newson@gmail.com> wrote:
>>> A subsequent run that encountered the connection refused error did not
>>> cause the couch_uuids supervisor to restart it, so the two problems
>>> are unrelated.
>>>
>>> On Sat, Oct 3, 2009 at 1:50 PM, Robert Newson <robert.newson@gmail.com> wrote:
>>>> Hi,
>>>>
>>>> Jan suggested I start a thread on dev about a problem I'm encountering
>>>> on couchdb trunk. I'm performing long running insertion tests (that
>>>> is, millions of inserts) in order to quantify the differences between
>>>> batch vs. sync and random identifiers vs. sequential ones. I find it
>>>> hard to complete a 5 million insertion run as my client eventually
>>>> (and randomly) gets a "connection refused" error from couchdb.
>>>> Immediately after that occurs, I can successfully hit couchdb with
>>>> curl, so it's transitory. I found the following errors in the log
>>>> around the time of the problem:
>>>>
>>>> =SUPERVISOR REPORT==== 3-Oct-2009::13:32:18 ===
>>>>     Supervisor: {local,couch_secondary_services}
>>>>     Context:    shutdown
>>>>     Reason:     reached_max_restart_intensity
>>>>     Offender:   [{pid,<0.5273.0>},
>>>>                  {name,uuids},
>>>>                  {mfa,{couch_uuids,start,[]}},
>>>>                  {restart_type,permanent},
>>>>                  {shutdown,brutal_kill},
>>>>                  {child_type,worker}]
>>>>
>>>> [error] [<0.76.0>] {error_report,<0.30.0>,
>>>>    {<0.76.0>,supervisor_report,
>>>>     [{supervisor,{local,couch_server_sup}},
>>>>      {errorContext,child_terminated},
>>>>      {reason,shutdown},
>>>>      {offender,
>>>>          [{pid,<0.2218.0>},
>>>>           {name,couch_secondary_services},
>>>>           {mfa,{couch_server_sup,start_secondary_services,[]}},
>>>>           {restart_type,permanent},
>>>>           {shutdown,infinity},
>>>>           {child_type,supervisor}]}]}}
>>>>
>>>> =SUPERVISOR REPORT==== 3-Oct-2009::13:32:18 ===
>>>>     Supervisor: {local,couch_server_sup}
>>>>     Context:    child_terminated
>>>>     Reason:     shutdown
>>>>     Offender:   [{pid,<0.2218.0>},
>>>>                  {name,couch_secondary_services},
>>>>                  {mfa,{couch_server_sup,start_secondary_services,[]}},
>>>>                  {restart_type,permanent},
>>>>                  {shutdown,infinity},
>>>>                  {child_type,supervisor}]
>>>>
>>>>
>>>> =ERROR REPORT==== 3-Oct-2009::13:32:18 ===
>>>> Error in process <0.5316.0> with exit value:
>>>> {badarg,[{ets,insert,[stats_hit_table,{{couchdb,open_databases},-1}]},{couch_stats_collector,decrement,1}]}
>>>>
>>>>
>>>> =ERROR REPORT==== 3-Oct-2009::13:32:18 ===
>>>> Error in process <0.5312.0> with exit value:
>>>> {badarg,[{ets,insert,[stats_hit_table,{{couchdb,open_os_files},-1}]},{couch_stats_collector,decrement,1}]}
>>>>
>>>
>>
>
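The four-step reproduction loop described in the thread can be sketched roughly as follows. This is not the original test (that client was written in Java and was not posted); it is a minimal Python illustration. The base URL, the database name `uuid_repro`, and the use of POST so that the server assigns document ids are all assumptions, and the config path is copied as written in the message. The database is assumed to exist already.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5984"  # assumption: a local CouchDB on the default port
DB_NAME = "uuid_repro"              # hypothetical database name, must already exist


def build_steps(docs_per_phase=10):
    """Return one loop iteration as a list of (method, path, body) tuples."""
    steps = []
    for algorithm in ("random", "sequential"):
        # Steps 1 and 3: switch the UUID algorithm (path as written in the message).
        steps.append(("PUT", "/_config/uuid/algorithm", json.dumps(algorithm)))
        # Steps 2 and 4: insert a few documents.
        # Assumption: POST is used so that the server generates the ids.
        for i in range(docs_per_phase):
            steps.append(("POST", "/" + DB_NAME, json.dumps({"n": i})))
    return steps


def http_send(method, path, body):
    """Send one request and return the HTTP status code."""
    req = urllib.request.Request(
        BASE_URL + path,
        data=body.encode("utf-8"),
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def run(iterations, send=http_send, docs_per_phase=10):
    """Repeat the loop; return (iteration, error) on the first failure, else None."""
    for n in range(iterations):
        for method, path, body in build_steps(docs_per_phase):
            try:
                send(method, path, body)
            except OSError as exc:
                # Connection refused surfaces as an OSError subclass;
                # HTTP error responses (urllib's HTTPError) are caught here too.
                return n, exc
    return None
```

Calling `run(1000)` against a test instance would stop at the first refused connection and report which iteration hit it. The `send` parameter exists only so the loop can be exercised without a server.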

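The `reached_max_restart_intensity` reason in the first supervisor report reflects a general OTP rule: a supervisor tolerates only a limited number of child restarts within a time window, and once that limit is exceeded it terminates its children and then itself. That matches the log, where couch_secondary_services shuts down and couch_server_sup then reports it as a terminated child. The sliding-window model below is only a simplified Python illustration of that rule, not OTP's implementation, and the limits used (10 restarts in 60 seconds) are hypothetical values, not CouchDB's actual supervisor settings.

```python
from collections import deque


class RestartIntensity:
    """Simplified model: more than max_restarts within max_seconds is fatal."""

    def __init__(self, max_restarts, max_seconds):
        self.max_restarts = max_restarts
        self.max_seconds = max_seconds
        self.restarts = deque()  # timestamps of recent restarts, oldest first

    def record_restart(self, now):
        """Record a child restart at time `now` (seconds).

        Returns True while the supervisor would keep going, and False once
        the restart limit has been exceeded.
        """
        self.restarts.append(now)
        # Forget restarts that have fallen out of the time window.
        while self.restarts and now - self.restarts[0] > self.max_seconds:
            self.restarts.popleft()
        return len(self.restarts) <= self.max_restarts


# Hypothetical limits: at most 10 restarts in any 60-second window.
limit = RestartIntensity(max_restarts=10, max_seconds=60)
# Eleven restarts, two seconds apart; only the last one exceeds the limit.
results = [limit.record_restart(t) for t in range(0, 22, 2)]
```

Under these assumed limits, a child that is restarted on every configuration change would bring its supervisor down after a short burst of changes, which is consistent with the quick failures reported when the algorithm setting is toggled in a tight loop.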