From: John Wood
Date: Wed, 2 Sep 2009 07:23:27 -0500
Subject: Re: CouchDB pegging the CPU and not responding to requests
To: user@couchdb.apache.org

In my case they are separate machines.

On Tue, Sep 1, 2009 at 10:26 PM, Benoit Chesneau wrote:

> On Tue, Sep 1, 2009 at 4:52 PM, John Wood wrote:
> > Hi everybody,
> >
> > I'm currently facing an issue with our production installation of
> > CouchDB. Two times within the past 5 days, the Erlang process running
> > CouchDB pegs one of the 4 cores on the machine, consumes about 40% of
> > the system RAM (which is 4GB), and becomes completely unresponsive to
> > incoming HTTP requests. The only way we can get it back to normal is
> > to restart CouchDB.
> >
> > I'm trying to determine what may be causing this, but I'm not having
> > much luck. Nothing stands out in the CouchDB log files. I can see that
> > there are no entries in the log files from the time it goes
> > unresponsive until the time I restart it. Besides that, there don't
> > appear to be any errors leading up to the issue. There are, however, a
> > few errors like the one below, but none right before CouchDB goes
> > unresponsive:
> >
> > [error] [<0.11738.288>] {error_report,<0.21.0>,
> >   {<0.11738.288>,crash_report,
> >    [[{pid,<0.11738.288>},
> >      {registered_name,[]},
> >      {error_info,
> >       {error,
> >        {case_clause,{error,enotconn}},
> >        [{mochiweb_request,get,2},
> >         {couch_httpd,handle_request,4},
> >         {mochiweb_http,headers,5},
> >         {proc_lib,init_p,5}]}},
> >      {initial_call,
> >       {mochiweb_socket_server,acceptor_loop,
> >        [{<0.56.0>,#Port<0.148>,#Fun}]}},
> >      {ancestors,
> >       [couch_httpd,couch_secondary_services,couch_server_sup,
> >        <0.1.0>]},
> >      {messages,[]},
> >      {links,[<0.56.0>,#Port<0.5032425>]},
> >      {dictionary,[{mochiweb_request_qs,[{"limit","0"}]}]},
> >      {trap_exit,false},
> >      {status,running},
> >      {heap_size,28657},
> >      {stack_size,23},
> >      {reductions,14034}],
> >     []]}}
> > [error] [<0.56.0>] {error_report,<0.21.0>,
> >   {<0.56.0>,std_error,
> >    {mochiweb_socket_server,235,
> >     {child_error,{case_clause,{error,enotconn}}}}}}
> >
> > =ERROR REPORT==== 30-Aug-2009::04:29:07 ===
> > {mochiweb_socket_server,235,
> >  {child_error,{case_clause,{error,enotconn}}}}
> >
> > I checked some of the other system log files (/var/log/messages, etc),
> > and there doesn't appear to be any information there either.
> >
> > Our CouchDB installation is fairly large. We have 7 production
> > databases, totaling almost 250GB. The largest database is 129GB. We
> > are running CouchDB 0.9.0 on Red Hat Enterprise Server 5.3. As far as
> > usage goes, we are constantly inserting documents into the database
> > (5,000 at a time via a bulk insert), and pausing to regenerate the
> > views after 100,000 documents have been inserted. Except for the
> > process that does the inserts, all views are accessed using stale=ok.
> >
> > Has anybody else faced a similar issue? Can anybody suggest tips
> > regarding how I should go about diagnosing this issue?
> >
> > Thanks,
> > John
> >
> > --
> > John Wood
> > Interactive Mediums
> > john@interactivemediums.com
> >
>
> Does this happen when the clients (couchrest) and the server (couchdb)
> are used on the same machine?
>
> - benoît
>

-- 
John Wood
Interactive Mediums
john@interactivemediums.com