incubator-couchdb-user mailing list archives

From Paolo Negri <paolo.ne...@wooga.net>
Subject Re: timeout hitting a database url after launching compaction
Date Mon, 17 Oct 2011 12:44:17 GMT
On Mon, Oct 17, 2011 at 2:30 PM, Robert Newson <rnewson@apache.org> wrote:
> Do you have the full stacktrace from couch.log?

I pasted it here: https://gist.github.com/1292529

>
> On 17 October 2011 13:04, Paolo Negri <paolo.negri@wooga.net> wrote:
>> On Mon, Oct 17, 2011 at 1:57 PM, Robert Newson <rnewson@apache.org> wrote:
>>> Compaction is an online process, there should be no expectation of 500
>>> responses before, during, or after compaction.
>>>
>>> In this case, it seems the couch_server process is blocked for more
>>> than five seconds performing I/O and the gen_server:call from
>>> couch_server:open times out. This timeout has been increased to
>>> infinity since 1.0.0.
>>>
>>> What version are you running?
>>
>> I compiled master from GitHub; here are the details:
>>
>> "CouchDB/1.3.0a-74613f5-git (Erlang OTP/R14B04)"},
>>
>> The reason to use master is that we wanted to benefit from the
>> ejson/snappy adoption so I guess I could actually also use the 1.2
>> branch
>>
>> Paolo
>>
>>>
>>> B.
>>>
>>> On 17 October 2011 12:05, Martin Hewitt <martin@thenoi.se> wrote:
>>>> I disagree, it makes sense as the 5xx error code range is for responses
>>>> where the server can't fulfil a well-formed, valid client request.
>>>>
>>>> Your GET is well-formed, but the server can't process it as it's working
>>>> on the previous action, so a 500 is perfectly valid. Perhaps a 503 would
>>>> be more accurate, but the 5xx prefix is certainly correct.
>>>>
>>>> Martin
>>>>
>>>> Sent from my iPhone
>>>>
>>>> On 17 Oct 2011, at 09:29, Paolo Negri <paolo.negri@wooga.net> wrote:
>>>>
>>>>> I agree on the fact that what happens is pretty clear to explain, I
>>>>> still thought it would be useful for the developers to know since
>>>>> offering a 500 status code for a known system condition is probably
>>>>> something that can be improved.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Paolo
>>>>>
>>>>> On Mon, Oct 17, 2011 at 10:24 AM, CGS <cgsmcmlxxv@gmail.com> wrote:
>>>>>> I am not a developer, but it seems quite logical, I may say. Once you
>>>>>> have started the compaction, your CouchDB is not responsive while the
>>>>>> database is preparing for compaction. If you trigger a GET immediately,
>>>>>> the web instance responds with status code 500 (internal server error,
>>>>>> meaning unresponsive server in this case). So, nothing unusual in my
>>>>>> opinion.
>>>>>>
>>>>>> Cheers,
>>>>>> CGS
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 10/17/2011 09:57 AM, Paolo Negri wrote:
>>>>>>>
>>>>>>> IO activity is not monitored. There's only one db on the couchdb
>>>>>>> instance and the described job is the only activity executed on
>>>>>>> this machine.
>>>>>>> Delaying the first request on the database url by 30 seconds did
>>>>>>> actually prevent the problem from happening again.
>>>>>>> So the issue seems to happen specifically at the moment right after
>>>>>>> compaction is started.
>>>>>>> The database is about 7GB once compressed. The server is hosted on
>>>>>>> ec2 with the database directory placed on its own dedicated
>>>>>>> ephemeral storage.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Paolo
>>>>>>>
>>>>>>> On Fri, Oct 14, 2011 at 9:05 PM, Paul Davis <paul.joseph.davis@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Do you monitor IO activity or system responsiveness when you're
>>>>>>>> doing this? I've seen some compactions wallop a system when it
>>>>>>>> switches over due to removing large old files and such. It doesn't
>>>>>>>> sound like this is big enough for that case but it might be
>>>>>>>> something worth checking.
>>>>>>>>
>>>>>>>>> On Fri, Oct 14, 2011 at 3:41 AM, Paolo Negri <paolo.negri@wooga.net> wrote:
>>>>>>>>>
>>>>>>>>> Dear list,
>>>>>>>>>
>>>>>>>>> We have a script that does the following (strictly sequentially)
>>>>>>>>>
>>>>>>>>> 1) update 300K docs in a db
>>>>>>>>> 2) launch compaction of the db
>>>>>>>>> 3) poll http://127.0.0.1:5984/database every 30 seconds to know
>>>>>>>>> when compaction has completed
>>>>>>>>>
>>>>>>>>> Last night we got a timeout error during 3. We think that this
>>>>>>>>> might be because the first poll (GET http://127.0.0.1:5984/database)
>>>>>>>>> is done right after triggering compaction.
>>>>>>>>>
>>>>>>>>> I thought the dev team might be interested in knowing that this is
>>>>>>>>> happening.
>>>>>>>>>
>>>>>>>>> There's no other activity on the couchdb instance other than what
>>>>>>>>> is described in this email.
>>>>>>>>>
>>>>>>>>> ERROR unexpectd response checking compaction db: {ok,"500",
>>>>>>>>>   [{"Server","CouchDB/1.3.0a-74613f5-git (Erlang OTP/R14B04)"},
>>>>>>>>>    {"Date","Fri, 14 Oct 2011 01:46:37 GMT"},
>>>>>>>>>    {"Content-Type","text/plain; charset=utf-8"},
>>>>>>>>>    {"Content-Length","350"},
>>>>>>>>>    {"Cache-Control","must-revalidate"}],
>>>>>>>>>
>>>>>>>>> <<"{\"error\":\"{timeout,{gen_server,call,[<0.21934.9>,{open_ref_count,<0.4090.13>}]}}\",\"reason\":\"{gen_server,call,\\n
>>>>>>>>>   [couch_server,\\n     {open,<<\\\"backup\\\">>,\\n
>>>>>>>>> [{user_ctx,\\n              {user_ctx,null,\\n
>>>>>>>>> [<<\\\"_admin\\\">>],\\n<<\\\"{couch_httpd_auth,
>>>>>>>>> default_authentication_handler}\\\">>}}]},\\n    infinity]}\"}\n">>}
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Paolo
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Engineering
>>>>> http://www.wooga.com | phone +49-30-8962 5058  | fax +49-30-8964 9064
>>>>>
>>>>> wooga GmbH | Saarbruecker Str. 38 | 10405 Berlin | Germany
>>>>> Sitz der Gesellschaft: Berlin; HRB 117846 B
>>>>> Registergericht Berlin-Charlottenburg
>>>>> Geschaeftsfuehrung: Jens Begemann, Philipp Moeser
>>>>
>>>
>>
>>
>>
>>
>
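
For anyone scripting the same update / compact / poll sequence, here is a minimal Python sketch of the polling step that tolerates a transient 5xx right after compaction is triggered. It is not our actual script, only an illustration under some assumptions: the function names (`fetch_db_info`, `wait_for_compaction`) and the defaults are made up for this example, the initial 30 second delay mirrors the workaround mentioned earlier in the thread, and it relies on the `compact_running` field that CouchDB returns in the database info document.

```python
import json
import time
import urllib.error
import urllib.request


def fetch_db_info(db_url, timeout=10.0):
    """GET the database URL; return (status, parsed JSON body or None)."""
    try:
        with urllib.request.urlopen(db_url, timeout=timeout) as resp:
            return resp.status, json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # 4xx/5xx replies surface as exceptions; only the status is needed.
        return err.code, None


def wait_for_compaction(db_url, fetch=fetch_db_info, sleep=time.sleep,
                        initial_delay=30.0, interval=30.0, max_polls=120):
    """Poll until compact_running is false, treating 5xx as retryable.

    Returns the number of polls made. Raises RuntimeError on any other
    non-200 status and TimeoutError when max_polls is exhausted.
    (All parameter names and defaults are assumptions for this sketch.)
    """
    # Give the server time to get past the start of compaction.
    sleep(initial_delay)
    for attempt in range(1, max_polls + 1):
        status, info = fetch(db_url)
        if status == 200 and not info.get("compact_running", False):
            return attempt
        if status != 200 and not 500 <= status < 600:
            raise RuntimeError(f"unexpected status {status} from {db_url}")
        # Either compaction is still running or the server answered 5xx.
        sleep(interval)
    raise TimeoutError(f"compaction still running after {max_polls} polls")
```

Called as `wait_for_compaction("http://127.0.0.1:5984/database")`, this would sleep, then poll every 30 seconds. The `fetch` and `sleep` parameters exist only so the loop can be exercised without a live server.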



