incubator-couchdb-user mailing list archives

From Paolo Negri <paolo.ne...@wooga.net>
Subject Re: timeout hitting a database url after launching compaction
Date Mon, 17 Oct 2011 12:04:38 GMT
On Mon, Oct 17, 2011 at 1:57 PM, Robert Newson <rnewson@apache.org> wrote:
> Compaction is an online process, there should be no expectation of 500
> responses before, during, or after compaction.
>
> In this case, it seems the couch_server process is blocked for more
> than five seconds performing I/O and the gen_server:call from
> couch_server:open times out. This timeout has been increased to
> infinity since 1.0.0.
>
> What version are you running?

I compiled master from GitHub; here are the details:

"CouchDB/1.3.0a-74613f5-git (Erlang OTP/R14B04)"},

We are using master because we wanted to benefit from the
ejson/snappy adoption, so I guess I could also use the 1.2
branch.

Paolo

>
> B.
>
> On 17 October 2011 12:05, Martin Hewitt <martin@thenoi.se> wrote:
>> I disagree, it makes sense as the 5xx error code range is for responses where the server can't fulfil a well-formed, valid client request.
>>
>> Your GET is well-formed, but the server can't process it as it's working on the previous action, so a 500 is perfectly valid. Perhaps a 503 would be more accurate, but the 5xx prefix is certainly correct.
>>
>> Martin
>>
>> Sent from my iPhone
>>
>> On 17 Oct 2011, at 09:29, Paolo Negri <paolo.negri@wooga.net> wrote:
>>
>>> I agree that what happens is easy to explain. I still thought it
>>> would be useful for the developers to know, since returning a 500
>>> status code for a known system condition is probably something
>>> that can be improved.
>>>
>>> Thanks,
>>>
>>> Paolo
>>>
>>> On Mon, Oct 17, 2011 at 10:24 AM, CGS <cgsmcmlxxv@gmail.com> wrote:
>>>> I am not a developer, but it seems quite logical to me. Once you have started
>>>> the compaction, your CouchDB is not responsive while the database is preparing
>>>> for compaction. If you trigger a GET immediately, the web instance responds with
>>>> status code 500 (internal server error, meaning an unresponsive server in this
>>>> case). So, nothing unusual in my opinion.
>>>>
>>>> Cheers,
>>>> CGS
>>>>
>>>>
>>>>
>>>>
>>>> On 10/17/2011 09:57 AM, Paolo Negri wrote:
>>>>>
>>>>> IO activity is not monitored, there's only one db on the couchdb
>>>>> instance and the described job is the only activity executed on this
>>>>> machine.
>>>>> Delaying the first request on the database url by 30 seconds did
>>>>> actually prevent the problem from happening again.
>>>>> So the issue seems to happen specifically at the moment right after
>>>>> compaction is started.
>>>>> The database is about 7 GB once compressed. The server is hosted on
>>>>> EC2, with the database directory placed on its own dedicated ephemeral
>>>>> storage.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Paolo
>>>>>
>>>>> On Fri, Oct 14, 2011 at 9:05 PM, Paul Davis<paul.joseph.davis@gmail.com>
>>>>>  wrote:
>>>>>>
>>>>>> Do you monitor IO activity or system responsiveness when you're doing
>>>>>> this? I've seen some compactions wallop a system when it switches over
>>>>>> due to removing large old files and such. It doesn't sound like this
>>>>>> is big enough for that case but it might be something worth checking.
>>>>>>
>>>>>> On Fri, Oct 14, 2011 at 3:41 AM, Paolo Negri<paolo.negri@wooga.net>
>>>>>>  wrote:
>>>>>>>
>>>>>>> Dear list,
>>>>>>>
>>>>>>> We have a script that does the following (strictly sequentially)
>>>>>>>
>>>>>>> 1) update 300K docs in a db
>>>>>>> 2) launch compaction of the db
>>>>>>> 3) poll at a 30 sec frequency http://127.0.0.1:5984/database to know
>>>>>>> when compaction completed
>>>>>>>
>>>>>>> Last night we got a timeout error during 3, we think that this might
>>>>>>> be because the first polling (GET  http://127.0.0.1:5984/database) is
>>>>>>> done right after triggering compaction
>>>>>>>
>>>>>>> I thought the dev team might be interested in knowing that this is
>>>>>>> happening
>>>>>>>
>>>>>>> There's no other activity on the couchdb instance other than what is
>>>>>>> described in this email.
>>>>>>>
>>>>>>> ERROR unexpectd response checking compaction db: {ok,"500",
>>>>>>>   [{"Server","CouchDB/1.3.0a-74613f5-git (Erlang OTP/R14B04)"},
>>>>>>>    {"Date","Fri, 14 Oct 2011 01:46:37 GMT"},
>>>>>>>    {"Content-Type","text/plain; charset=utf-8"},
>>>>>>>    {"Content-Length","350"},
>>>>>>>    {"Cache-Control","must-revalidate"}],
>>>>>>>
>>>>>>> <<"{\"error\":\"{timeout,{gen_server,call,[<0.21934.9>,{open_ref_count,<0.4090.13>}]}}\",\"reason\":\"{gen_server,call,\\n
>>>>>>>   [couch_server,\\n     {open,<<\\\"backup\\\">>,\\n
>>>>>>> [{user_ctx,\\n              {user_ctx,null,\\n
>>>>>>> [<<\\\"_admin\\\">>],\\n<<\\\"{couch_httpd_auth,
>>>>>>> default_authentication_handler}\\\">>}}]},\\n     infinity]}\"}\n">>}
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Paolo
>>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Engineering
>>> http://www.wooga.com | phone +49-30-8962 5058  | fax +49-30-8964 9064
>>>
>>> wooga GmbH | Saarbruecker Str. 38 | 10405 Berlin | Germany
>>> Sitz der Gesellschaft: Berlin; HRB 117846 B
>>> Registergericht Berlin-Charlottenburg
>>> Geschaeftsfuehrung: Jens Begemann, Philipp Moeser
>>
>
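
For reference, step 3 of the script in the original report (polling the
database URL until compaction completes) could be made tolerant of a
transient 500 along the following lines. This is only a minimal sketch in
Python, not the script used in the thread. It assumes the database info
document returned by GET on the database URL carries the boolean
"compact_running" field, and the names wait_for_compaction, fetch_db_info
and max_errors are hypothetical, introduced here for illustration.

```python
import json
import time
import urllib.error
import urllib.request


def fetch_db_info(url, timeout=10.0):
    """GET the database URL; return (status, parsed JSON body or None)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; report the status and drop the body.
        return err.code, None


def wait_for_compaction(url, fetch=fetch_db_info, interval=30.0,
                        max_errors=5, sleep=time.sleep):
    """Poll until compact_running is false.

    Up to max_errors consecutive 5xx responses are treated as transient
    (hypothetical policy); any other non-200 status raises immediately.
    """
    errors = 0
    while True:
        status, info = fetch(url)
        if status == 200 and info is not None:
            errors = 0  # a good response resets the error budget
            if not info.get("compact_running", False):
                return info
        elif 500 <= status < 600 and errors < max_errors:
            errors += 1
        else:
            raise RuntimeError(f"unexpected status {status} from {url}")
        sleep(interval)
```

With the 30 second interval from the report, a single 500 right after
compaction is triggered would simply be retried on the next poll, e.g.
wait_for_compaction("http://127.0.0.1:5984/database"). Connection-level
failures (refused connection, socket timeout) are deliberately not caught
in this sketch and would still propagate to the caller.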



