incubator-couchdb-user mailing list archives

From Robert Newson <rnew...@apache.org>
Subject Re: timeout hitting a database url after launching compaction
Date Mon, 17 Oct 2011 11:57:51 GMT
Compaction is an online process, there should be no expectation of 500
responses before, during, or after compaction.

In this case, it seems the couch_server process is blocked for more
than five seconds performing I/O and the gen_server:call from
couch_server:open times out. This timeout has been increased to
infinity since 1.0.0.
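
On the client side, the polling step from the original report could tolerate a transient 5xx like this one by retrying instead of aborting. A minimal Python sketch, assuming the database info document returned by a GET on the database URL carries a `compact_running` flag. The function names `fetch_db_info` and `wait_for_compaction`, the retry limits, and the choice to treat every 5xx as transient are illustrative assumptions, not something taken from the script in this thread:

```python
import json
import time
import urllib.error
import urllib.request


def fetch_db_info(db_url, timeout=10.0):
    """Return (status, info) for GET db_url; info is None on error responses."""
    try:
        with urllib.request.urlopen(db_url, timeout=timeout) as resp:
            return resp.status, json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; hand back the status code only.
        return err.code, None


def wait_for_compaction(db_url, interval=30.0, max_attempts=120,
                        fetch=fetch_db_info, sleep=time.sleep):
    """Poll db_url until compact_running is false; return the number of polls.

    Assumption: any 5xx is transient and worth retrying. Any other
    non-200 status raises immediately.
    """
    for attempt in range(1, max_attempts + 1):
        status, info = fetch(db_url)
        if status == 200 and not info.get("compact_running", False):
            return attempt
        if status != 200 and not 500 <= status < 600:
            raise RuntimeError(f"unexpected status {status} from {db_url}")
        sleep(interval)
    raise TimeoutError(f"compaction still running after {max_attempts} polls")
```

Called as `wait_for_compaction("http://127.0.0.1:5984/database")`, this would poll every 30 seconds as in the original script. Connection errors and socket timeouts are deliberately not caught here and would still propagate to the caller.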

What version are you running?

B.

On 17 October 2011 12:05, Martin Hewitt <martin@thenoi.se> wrote:
> I disagree, it makes sense as the 5xx error code range is for responses where the server
> can't fulfil a well-formed, valid client request.
>
> Your GET is well-formed, but the server can't process it as it's working on the previous
> action, so a 500 is perfectly valid. Perhaps a 503 would be more accurate, but the 5xx prefix
> is certainly correct.
>
> Martin
>
> Sent from my iPhone
>
> On 17 Oct 2011, at 09:29, Paolo Negri <paolo.negri@wooga.net> wrote:
>
>> I agree on the fact that what happens is pretty clear to explain, I
>> still thought it would be useful for the developers to know since
>> offering a 500 status code for a known system condition is probably
>> something that can be improved.
>>
>> Thanks,
>>
>> Paolo
>>
>> On Mon, Oct 17, 2011 at 10:24 AM, CGS <cgsmcmlxxv@gmail.com> wrote:
>>> I am not a developer, but it seems quite logical to me. Once you start the
>>> compaction, your CouchDB is not responsive while the database is preparing
>>> for compaction. If you trigger a GET immediately, the web instance responds
>>> with status code 500 (internal server error, meaning an unresponsive server
>>> in this case). So, nothing unusual in my opinion.
>>>
>>> Cheers,
>>> CGS
>>>
>>>
>>>
>>>
>>> On 10/17/2011 09:57 AM, Paolo Negri wrote:
>>>>
>>>> IO activity is not monitored, there's only one db on the couchdb
>>>> instance and the described job is the only activity executed on this
>>>> machine.
>>>> Delaying the first request on the database url by 30 seconds did
>>>> actually prevent the problem from happening again.
>>>> So the issue seems to happen specifically at the moment right after
>>>> compaction is started.
>>>> The database is about 7GB once compressed; the server is hosted on
>>>> EC2 with the database directory placed on its own dedicated ephemeral
>>>> storage.
>>>>
>>>> Thanks,
>>>>
>>>> Paolo
>>>>
>>>> On Fri, Oct 14, 2011 at 9:05 PM, Paul Davis <paul.joseph.davis@gmail.com> wrote:
>>>>>
>>>>> Do you monitor IO activity or system responsiveness when you're doing
>>>>> this? I've seen some compactions wallop a system when it switches over
>>>>> due to removing large old files and such. It doesn't sound like this
>>>>> is big enough for that case but it might be something worth checking.
>>>>>
>>>>> On Fri, Oct 14, 2011 at 3:41 AM, Paolo Negri <paolo.negri@wooga.net> wrote:
>>>>>>
>>>>>> Dear list,
>>>>>>
>>>>>> We have a script that does the following (strictly sequentially)
>>>>>>
>>>>>> 1) update 300K docs in a db
>>>>>> 2) launch compaction of the db
>>>>>> 3) poll at a 30 sec frequency http://127.0.0.1:5984/database to know
>>>>>> when compaction completed
>>>>>>
>>>>>> Last night we got a timeout error during 3, we think that this might
>>>>>> be because the first polling (GET http://127.0.0.1:5984/database) is
>>>>>> done right after triggering compaction
>>>>>>
>>>>>> I thought the dev team might be interested in knowing that this is
>>>>>> happening
>>>>>>
>>>>>> There's no other activity on the couchdb instance other than what is
>>>>>> described in this email.
>>>>>>
>>>>>> ERROR unexpectd response checking compaction db: {ok,"500",
>>>>>>     [{"Server","CouchDB/1.3.0a-74613f5-git (Erlang OTP/R14B04)"},
>>>>>>      {"Date","Fri, 14 Oct 2011 01:46:37 GMT"},
>>>>>>      {"Content-Type","text/plain; charset=utf-8"},
>>>>>>      {"Content-Length","350"},
>>>>>>      {"Cache-Control","must-revalidate"}],
>>>>>>
>>>>>>
>>>>>> <<"{\"error\":\"{timeout,{gen_server,call,[<0.21934.9>,{open_ref_count,<0.4090.13>}]}}\",\"reason\":\"{gen_server,call,\\n
>>>>>>   [couch_server,\\n     {open,<<\\\"backup\\\">>,\\n
>>>>>> [{user_ctx,\\n              {user_ctx,null,\\n
>>>>>> [<<\\\"_admin\\\">>],\\n<<\\\"{couch_httpd_auth,
>>>>>> default_authentication_handler}\\\">>}}]},\\n     infinity]}\"}\n">>}
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Paolo
>>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>> --
>> Engineering
>> http://www.wooga.com | phone +49-30-8962 5058  | fax +49-30-8964 9064
>>
>> wooga GmbH | Saarbruecker Str. 38 | 10405 Berlin | Germany
>> Sitz der Gesellschaft: Berlin; HRB 117846 B
>> Registergericht Berlin-Charlottenburg
>> Geschaeftsfuehrung: Jens Begemann, Philipp Moeser
>
