Ok, I've got a little further. If I change my test to much shorter runs
(even 10 documents), I can reproduce the connection refused symptom
and the stacktrace I pasted originally in under a minute, every time.
What appears to be happening is that the couch_uuids gen_server is
failing (being restarted too frequently), part of the supervision tree
is torn down and rebuilt, and a concurrent write operation fails while
that is happening. Since I'm pretty sure that's not what should happen
with Erlang/OTP, it's hopefully a straightforward bug.
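
For anyone who hasn't looked at supervisor restart limits: as I
understand it, a supervisor tolerates at most MaxR restarts within
MaxT seconds, and once that is exceeded it terminates its children and
then itself, which matches the reached_max_restart_intensity report
quoted below. Here is a rough model of that rule in Java (only because
that's what my client is written in). The class name, method name and
numbers are made up for illustration; this is not CouchDB or OTP code
and it ignores edge cases at the window boundary.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of a supervisor restart limit. All names are hypothetical. */
class RestartWindow {
    private final int maxRestarts;    // plays the role of MaxR
    private final long windowMillis;  // plays the role of MaxT (in ms here)
    private final Deque<Long> restarts = new ArrayDeque<>();

    RestartWindow(int maxRestarts, long windowMillis) {
        this.maxRestarts = maxRestarts;
        this.windowMillis = windowMillis;
    }

    /**
     * Records one child restart at the given time.
     * Returns false once more than maxRestarts fall inside the window,
     * i.e. the point where a supervisor would give up and shut down.
     */
    boolean recordRestart(long nowMillis) {
        restarts.addLast(nowMillis);
        // Forget restarts that are older than the window.
        while (!restarts.isEmpty()
                && nowMillis - restarts.peekFirst() > windowMillis) {
            restarts.removeFirst();
        }
        return restarts.size() <= maxRestarts;
    }

    public static void main(String[] args) {
        // Assumed limit for the demo: 3 restarts per 10 seconds.
        RestartWindow w = new RestartWindow(3, 10_000);
        for (long t = 0; t <= 3_000; t += 1_000) {
            System.out.println("restart at " + t + " ms, still alive: "
                    + w.recordRestart(t));
        }
    }
}
```

With those assumed numbers the fourth restart in quick succession
trips the limit, whereas the same four restarts spread 20 seconds
apart would not.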
Alas, my test client is in Java (using httpclient 4.0, fwiw), so I
can't easily post a unit test for this right now.
B.
On Sat, Oct 3, 2009 at 1:52 PM, Robert Newson <robert.newson@gmail.com> wrote:
> A subsequent run that encountered the connection refused error did not
> cause the couch_uuids supervisor to restart it, so the two problems
> are unrelated.
>
> On Sat, Oct 3, 2009 at 1:50 PM, Robert Newson <robert.newson@gmail.com> wrote:
>> Hi,
>>
>> Jan suggested I start a thread on dev about a problem I'm encountering
>> on couchdb trunk. I'm performing long running insertion tests (that
>> is, millions of inserts) in order to quantify the differences between
>> batch vs. sync and random identifiers vs. sequential ones. I find it
>> hard to complete a 5 million insertion run as my client eventually
>> (and randomly) gets a "connection refused" error from couchdb.
>> Immediately after that occurs, I can successfully hit couchdb with
>> curl, so it's transitory. I found the following errors in the log
>> around the time of the problem;
>>
>> =SUPERVISOR REPORT==== 3-Oct-2009::13:32:18 ===
>> Supervisor: {local,couch_secondary_services}
>> Context: shutdown
>> Reason: reached_max_restart_intensity
>> Offender: [{pid,<0.5273.0>},
>> {name,uuids},
>> {mfa,{couch_uuids,start,[]}},
>> {restart_type,permanent},
>> {shutdown,brutal_kill},
>> {child_type,worker}]
>>
>> [error] [<0.76.0>] {error_report,<0.30.0>,
>> {<0.76.0>,supervisor_report,
>> [{supervisor,{local,couch_server_sup}},
>> {errorContext,child_terminated},
>> {reason,shutdown},
>> {offender,
>> [{pid,<0.2218.0>},
>> {name,couch_secondary_services},
>> {mfa,{couch_server_sup,start_secondary_services,[]}},
>> {restart_type,permanent},
>> {shutdown,infinity},
>> {child_type,supervisor}]}]}}
>>
>> =SUPERVISOR REPORT==== 3-Oct-2009::13:32:18 ===
>> Supervisor: {local,couch_server_sup}
>> Context: child_terminated
>> Reason: shutdown
>> Offender: [{pid,<0.2218.0>},
>> {name,couch_secondary_services},
>> {mfa,{couch_server_sup,start_secondary_services,[]}},
>> {restart_type,permanent},
>> {shutdown,infinity},
>> {child_type,supervisor}]
>>
>>
>> =ERROR REPORT==== 3-Oct-2009::13:32:18 ===
>> Error in process <0.5316.0> with exit value:
>> {badarg,[{ets,insert,[stats_hit_table,{{couchdb,open_databases},-1}]},{couch_stats_collector,decrement,1}]}
>>
>>
>> =ERROR REPORT==== 3-Oct-2009::13:32:18 ===
>> Error in process <0.5312.0> with exit value:
>> {badarg,[{ets,insert,[stats_hit_table,{{couchdb,open_os_files},-1}]},{couch_stats_collector,decrement,1}]}
>>
>