From: Damien Katz
To: user@couchdb.apache.org
Subject: Re: couchdb kills itself
Date: Wed, 20 May 2009 08:47:36 -0400

Are there more errors in the log? This error only makes sense to me if something else is restarting, because of a configuration change or because something else must have crashed beforehand. For example, running the test suite restarts components during testing which could cause a crash like this.

-Damien

On May 20, 2009, at 8:03 AM, Tim Somers wrote:

> On Wed, May 20, 2009 at 11:57 AM, Paul Davis wrote:
>
>> On Wed, May 20, 2009 at 6:15 AM, Tim Somers wrote:
>>> Hi,
>>>
>>> I'm getting the exact same error:
>>>
>>> [error] [<0.7002.0>] {error_report,<0.22.0>,
>>> {<0.7002.0>,crash_report,
>>> [[{pid,<0.7002.0>},
>>> {registered_name,[]},
>>> {error_info,
>>> {exit,
>>> {timeout,
>>> {gen_server,call,
>>> [couch_config,
>>> {register,#Fun,<0.7002.0>}]}},
>>> [{gen_server,call,2},
>>> {couch_httpd,handle_request,4},
>>> {mochiweb_http,headers,5},
>>> {proc_lib,init_p_do_apply,3}]}},
>>> {initial_call,{mochiweb_socket_server,acceptor_loop,
>>> ['Argument__1']}},
>>> {ancestors,
>>> [couch_httpd,couch_secondary_services,couch_server_sup,<0.1.0>]},
>>> {messages,[]},
>>> {links,[<0.52.0>,#Port<0.4751>]},
>>> {dictionary,[]},
>>> {trap_exit,false},
>>> {status,running},
>>> {heap_size,2584},
>>> {stack_size,23},
>>> {reductions,1669}],
>>> []]}}
>>> [error] [<0.52.0>] {error_report,<0.22.0>,
>>> {<0.52.0>,std_error,
>>> {mochiweb_socket_server,235,
>>> {child_error,
>>> {timeout,
>>> {gen_server,call,
>>> [couch_config,
>>> {register,#Fun,
>>> <0.7002.0>}]}}}}}}
>>>
>>> =ERROR REPORT==== 20-May-2009::12:02:45 ===
>>> {mochiweb_socket_server,235,
>>> {child_error,
>>> {timeout,
>>> {gen_server,call,
>>> [couch_config,
>>> {register,#Fun>> 9.104562741>,<0.7002.0>}]}}}}
>>>
>>> although it seems to happen when the system is overloaded. In
>>> total, I have 5 processes constantly reading from and writing to
>>> the same couchdb, with a resulting load average of about 3 and
>>> physical memory at its limit. I get the impression (though it's
>>> hard to reproduce) that this error comes at the moment the system
>>> is swapping some ram out to disk, making couchdb run into some
>>> timeout while calculating a view.
>>> Couchdb does stay online though, only crashing my app with an
>>> unusable result.
>>>
>>
>> Can you check if couchdb actually stays alive or if it's getting
>> respawned by heart? The easiest way to test this is to run couchdb
>> with the command line without the init.d script.
>>
>> Erlang closes the entire VM when it's unable to acquire memory. I.e.,
>> if malloc returns NULL, then the whole VM closes. The general idea
>> being that it'll just rely on heart to be restarted.
>>
>> Paul Davis
>>
>>> I'm using svn version 776257
>>>
>>> Tim
>>>
>>
>
> It stays alive, I always start it from command line. I use the svn
> version on port 5985, and the version installed by debian package
> manager on port 5984 for comparison.
>
> Tim
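
To follow up on the question of whether the log contains earlier errors, a minimal sketch along these lines could scan a log for `[error]` entries and count them per Erlang pid. It is not part of CouchDB; it assumes only the line prefix visible in the quoted report (`[error] [<pid>] ...`), and the names `error_pids` and `summarize` are hypothetical.

```python
import re
from collections import Counter

# Assumed line prefix, taken from the quoted report:
#   [error] [<0.7002.0>] {error_report,<0.22.0>,
ERROR_LINE = re.compile(r"^\[error\] \[(<\d+\.\d+\.\d+>)\]")


def error_pids(log_lines):
    """Return the pids of [error] entries in the order they appear."""
    pids = []
    for line in log_lines:
        match = ERROR_LINE.match(line)
        if match:
            pids.append(match.group(1))
    return pids


def summarize(log_lines):
    """Count [error] entries per pid so earlier crashes stand out."""
    return Counter(error_pids(log_lines))
```

Feeding it the lines of a log file (for example via `open(path).read().splitlines()`, with `path` pointing at wherever the instance writes its log) gives the order in which processes reported errors, which may help show whether something crashed before the `couch_config` timeout.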