couchdb-dev mailing list archives

From Jan Lehnardt <...@apache.org>
Subject Re: {error,emfile} on CouchDB 1.2.x
Date Sun, 18 Mar 2012 20:48:50 GMT

On Mar 18, 2012, at 21:46 , Randall Leeds wrote:

> On Sun, Mar 18, 2012 at 13:39, Jan Lehnardt <jan@apache.org> wrote:
> 
>> 
>> On Mar 18, 2012, at 21:28 , Randall Leeds wrote:
>> 
>>> On Sun, Mar 18, 2012 at 11:08, Stefan Kögl <koeglstefan@gmail.com> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> Another thing I noticed during my tests of CouchDB 1.2.x. I redirected
>>>> live traffic to the instance and after a rather short time, requests
>>>> were failing with the following information in the logs:
>>>> 
>>>> 
>>>> [Sun, 18 Mar 2012 16:39:24 GMT] [error] [<0.27554.2>]
>>>> {error_report,<0.31.0>,
>>>>                                  {<0.27554.2>,std_error,
>>>>                                   [{application,mochiweb},
>>>>                                    "Accept failed error",
>>>>                                    "{error,emfile}"]}}
>>>> [Sun, 18 Mar 2012 16:39:24 GMT] [error] [<0.27554.2>]
>>>> {error_report,<0.31.0>,
>>>>                        {<0.27554.2>,crash_report,
>>>>                         [[{initial_call,
>>>>                               {mochiweb_acceptor,init,
>>>>                                   ['Argument__1','Argument__2',
>>>>                                    'Argument__3']}},
>>>>                           {pid,<0.27554.2>},
>>>>                           {registered_name,[]},
>>>>                           {error_info,
>>>>                               {exit,
>>>>                                   {error,accept_failed},
>>>>                                   [{mochiweb_acceptor,init,3},
>>>>                                    {proc_lib,init_p_do_apply,3}]}},
>>>>                           {ancestors,
>>>>                               [couch_httpd,couch_secondary_services,
>>>>                                couch_server_sup,<0.32.0>]},
>>>>                           {messages,[]},
>>>>                           {links,[<0.129.0>]},
>>>>                           {dictionary,[]},
>>>>                           {trap_exit,false},
>>>>                           {status,running},
>>>>                           {heap_size,233},
>>>>                           {stack_size,24},
>>>>                           {reductions,244}],
>>>>                          []]}}
>>>> 
>>>> 
>>>> I think "emfile" means that CouchDB (or mochiweb?) couldn't open any
>>>> more files / connections. I've set the (hard and soft) nofile limit for
>>>> user couchdb to 4096, but didn't raise the ERL_MAX_PORTS accordingly.
>>>> Anyway, as soon as the error occurred, CouchDB started writing most of my
>>>> view files from scratch, rendering the instance unusable.
>>>> 
>>>> I'd expect CouchDB to fail more gracefully when the maximum number of
>>>> open files is reached. Is this a bug or expected behaviour?
>>>> 
>>> 
>>> Looks like a bug. Whenever there's a problem opening a view file,
>>> couch_view tries to delete it. Clearly, this is not the right course of
>>> action when the problem is due to emfile.
>> 
>> This looks rather serious. I opened a JIRA:
>> 
>> https://issues.apache.org/jira/browse/COUCHDB-1445
>> 
>> And started collecting the info. Bob N's message came in in the meantime
>> and I agree, we should see if there's more cases where we need to be
>> careful.
>> 
>> Also, I'd consider this blocking for 1.2.0.
>> 
>> Anyone who can pitch in with their expertise is more than welcome! :)
>> 
> 
> Assigned to me. Patch forthcoming. Agree it should block 1.2.0, especially
> because upgrades are the sort of things where bad packaging downstream
> might cause custom ERL_MAX_PORTS settings to be overwritten and we wouldn't
> want anyone's production to have its views erased needlessly.

Thanks for taking this on Randall!

Cheers
Jan
-- 

> 
> -Randall
> 
> 
>> 
>> Cheers
>> Jan
>> --
>> 
>> 
>>> 
>>> Here's a patch that I propose might fix it. I'd like to hear from another
>>> dev on this, or if there's a better way we should bail out.
>>> 
>>> diff --git a/src/couchdb/couch_view_group.erl b/src/couchdb/couch_view_group.erl
>>> index 97fc512..ab075bd 100644
>>> --- a/src/couchdb/couch_view_group.erl
>>> +++ b/src/couchdb/couch_view_group.erl
>>> @@ -469,6 +469,10 @@ open_index_file(RootDir, DbName, GroupSig) ->
>>>    case couch_file:open(FileName) of
>>>    {ok, Fd}        -> {ok, Fd};
>>>    {error, enoent} -> couch_file:open(FileName, [create]);
>>> +    {error, emfile} ->
>>> +        ?LOG_ERROR("Could not open file for view index: max open files reached. "
>>> +                   "Raise ERL_MAX_PORTS or system limits.", []),
>>> +        throw({error, emfile});
>>>    Error           -> Error
>>>    end.
>> 
>> 
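
The point raised in the thread, that an open failure caused by descriptor exhaustion should not be handled like a damaged or missing index file, can be sketched in a more general form. The following is a minimal illustration only, not CouchDB code: the module `index_open_sketch` and the functions `classify/1` and `open_existing/1` are hypothetical names, and it uses the standard `file:open/2` rather than `couch_file:open`. Which error reasons belong in the "resource exhausted" group is also an assumption.

```erlang
%% Hypothetical sketch, not CouchDB source. All names here are invented
%% for illustration.
-module(index_open_sketch).
-export([classify/1, open_existing/1]).

%% Sort a POSIX error reason into three groups so that callers can tell
%% "this process ran out of resources" apart from "this file is bad".
classify(emfile) -> resource_exhausted;  % per-process descriptor limit hit
classify(enfile) -> resource_exhausted;  % system-wide file table is full
classify(enomem) -> resource_exhausted;  % out of memory
classify(enoent) -> missing;             % file does not exist yet
classify(_Other) -> file_problem.        % anything else

%% Try to open an existing file read-only and attach the classification
%% to any error, leaving the decision (retry, create, rebuild) to the
%% caller. Nothing is deleted here.
open_existing(FileName) ->
    case file:open(FileName, [read, raw, binary]) of
        {ok, Fd} ->
            {ok, Fd};
        {error, Reason} ->
            {error, classify(Reason), Reason}
    end.
```

With a helper of this kind, a caller would only consider resetting an index when the classification is `file_problem`, would create the file on `missing`, and would log and bail out on `resource_exhausted`, as the patch above does for `emfile`. For example, `index_open_sketch:classify(emfile)` returns `resource_exhausted`.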

