From: Robert Newson
To: dev@couchdb.apache.org
Date: Sun, 18 Mar 2012 20:34:53 +0000
Subject: Re: {error,emfile} on CouchDB 1.2.x
References: <4F6624AA.7050202@gmail.com>

I'd rather improve the handling in prepare_group/4. I'd hope we could
explicitly enumerate the cases where deleting the view index is a
reasonable response, rather than do it quite this capriciously.

B.

On 18 March 2012 20:28, Randall Leeds wrote:
> On Sun, Mar 18, 2012 at 11:08, Stefan Kögl wrote:
>
>> Hi,
>>
>> Another thing I noticed during my tests of CouchDB 1.2.x. I redirected
>> live traffic to the instance and after a rather short time, requests
>> were failing with the following information in the logs:
>>
>>
>> [Sun, 18 Mar 2012 16:39:24 GMT] [error] [<0.27554.2>]
>> {error_report,<0.31.0>,
>>                   {<0.27554.2>,std_error,
>>                    [{application,mochiweb},
>>                     "Accept failed error",
>>                     "{error,emfile}"]}}
>> [Sun, 18 Mar 2012 16:39:24 GMT] [error] [<0.27554.2>]
>> {error_report,<0.31.0>,
>>              {<0.27554.2>,crash_report,
>>               [[{initial_call,
>>                  {mochiweb_acceptor,init,
>>                   ['Argument__1','Argument__2',
>>                    'Argument__3']}},
>>                {pid,<0.27554.2>},
>>                {registered_name,[]},
>>                {error_info,
>>                 {exit,
>>                  {error,accept_failed},
>>                  [{mochiweb_acceptor,init,3},
>>                   {proc_lib,init_p_do_apply,3}]}},
>>                {ancestors,
>>                 [couch_httpd,couch_secondary_services,
>>                  couch_server_sup,<0.32.0>]},
>>                {messages,[]},
>>                {links,[<0.129.0>]},
>>                {dictionary,[]},
>>                {trap_exit,false},
>>                {status,running},
>>                {heap_size,233},
>>                {stack_size,24},
>>                {reductions,244}],
>>               []]}}
>>
>>
>> I think "emfile" means that CouchDB (or mochiweb?) couldn't open any
>> more files / connections. I've set the (hard and soft) nofile limit for
>> user couchdb to 4096, but didn't raise the ERL_MAX_PORTS accordingly.
>> Anyway, as soon as the error occurred, CouchDB started writing most of my
>> view files from scratch, rendering the instance unusable.
>>
>> I'd expect CouchDB to fail more gracefully when the maximum number of
>> open files is reached. Is this a bug or expected behaviour?
>>
>
> Looks like a bug. Whenever there's a problem opening a view file,
> couch_view tries to delete it. Clearly, this is not the right course of
> action when the problem is due to emfile.
>
> Here's a patch that I propose might fix it. I'd like to hear from another
> dev on this, or if there's a better way we should bail out.
>
> diff --git a/src/couchdb/couch_view_group.erl b/src/couchdb/couch_view_group.erl
> index 97fc512..ab075bd 100644
> --- a/src/couchdb/couch_view_group.erl
> +++ b/src/couchdb/couch_view_group.erl
> @@ -469,6 +469,10 @@ open_index_file(RootDir, DbName, GroupSig) ->
>      case couch_file:open(FileName) of
>      {ok, Fd}        -> {ok, Fd};
>      {error, enoent} -> couch_file:open(FileName, [create]);
> +    {error, emfile} ->
> +        ?LOG_ERROR("Could not open file for view index: max open files reached. "
> +                   "Raise ERL_MAX_PORTS or system limits.", []),
> +        throw({error, emfile});
>      Error           -> Error
>      end.
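
The suggestion at the top of this message, to enumerate the open errors
explicitly instead of treating every failure as a reason to delete the view
index, could be sketched roughly as below. This is only an illustration: the
module name index_open_policy, the function classify/1 and the action atoms
are hypothetical and are not CouchDB code, and the grouping of errors is an
assumption. Which errors genuinely justify deleting the index is left open,
since the thread does not list them.

```erlang
-module(index_open_policy).
-export([classify/1]).

%% Hypothetical helper, not part of CouchDB. It maps the result of a
%% failed file open to the action the caller should take.
%%
%% The grouping is an assumption made for illustration:
%%   create     - the index file does not exist yet
%%   keep_index - the failure says nothing about the file's contents
%%                (resource exhaustion, permissions), so the existing
%%                index is left alone and the error is surfaced
%%   unknown    - anything else; the caller decides, and should not
%%                delete by default
-spec classify(term()) -> create | keep_index | unknown.
classify({error, enoent}) -> create;
classify({error, emfile}) -> keep_index;  % per-process fd limit reached
classify({error, enfile}) -> keep_index;  % system-wide file table full
classify({error, enomem}) -> keep_index;
classify({error, eacces}) -> keep_index;
classify(_Other)          -> unknown.
```

With a table like this, the emfile case from the patch above becomes one
entry among several, and any future "delete the index" case has to be added
deliberately as its own clause.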