Subject: Re: stray couchjs processes
From: Jan Lehnardt
Date: Sat, 16 Apr 2011 14:39:10 +0200
To: user@couchdb.apache.org
Cc: dev@couchdb.apache.org

Hi Ning,

the correlation between couchjs and HTTP requests is that whenever a
request needs couchjs for anything, it will use
one that is around and idle. When CouchDB starts, none are idle and it
will fork and exec a new couchjs process. A couchjs process is not idle
while a request is using it. So for every concurrent request that needs
couchjs, you will get a new fork & exec of a couchjs process.

I haven't looked at the current implementation in a while, but we
should look into implementing a configurable ceiling that can't be
crossed with more fork & exec. Requests could then either wait until a
couchjs is idle and eventually time out if none get freed, or they could
be served a Service Unavailable (503). That behaviour should also be
configurable.

CCing dev@ to see if we can get more feedback on this.

Cheers
Jan

On 15 Apr 2011, at 20:16, Ning Tan wrote:

> A while back there was a post about stray couchjs processes that had
> no apparent resolution. A similar situation happened in our
> environment that resulted in hundreds of couchjs processes, which
> caused out-of-memory problems for the server.
>
> We are investigating the cause and would appreciate any help in
> pinpointing the problem. One thing that was curious to me is, how many
> couchjs processes are needed to support concurrent requests. I
> couldn't reproduce a large number of couchjs processes in my local
> environment. It seems that all my view/filter requests were handled by
> just one couchjs process.
>
> The environment that had problems was using 1.0.1. I've been testing
> locally with 1.0.2. Would that make any difference?
>
> Also, the problematic environment had proxies sitting in front of the
> couch boxes, so that's another variance in our analysis. But it's hard
> to tell without knowing the relationship/cardinality between an HTTP
> connection and a couchjs process. In the original post, connections
> not properly closed were hinted as a potential culprit. However, it's
> still unclear to me how mishandled HTTP connections can result in
> multiple couchjs processes.
> If I'm not mistaken, couchjs only talks via stdin/stdout and is not
> handling a connection directly.
>
> Sorry if this question doesn't have enough information. We are still
> in very early stages of our analysis and don't have a lot of leads
> yet.
>
> Thanks!
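
To make the configurable ceiling suggested in the reply a bit more concrete, here is a minimal sketch of a bounded worker pool. It is written in Python purely for illustration; it is not CouchDB's implementation (CouchDB is Erlang), and the names BoundedWorkerPool, PoolExhausted, max_workers and wait_timeout are hypothetical, not real CouchDB settings. The spawn callable stands in for "fork & exec a couchjs process".

```python
import threading
import time


class PoolExhausted(Exception):
    """No worker became idle before the wait timeout expired.

    A server could map this to a 503 Service Unavailable response.
    """


class BoundedWorkerPool:
    """Reuse idle workers and never create more than max_workers.

    spawn        -- hypothetical callable that starts one worker
                    (for example, by forking and exec'ing couchjs)
    max_workers  -- the ceiling that must not be crossed
    wait_timeout -- seconds to wait for an idle worker once the ceiling
                    is reached; 0 means fail immediately
    """

    def __init__(self, spawn, max_workers, wait_timeout=0.0):
        self._spawn = spawn
        self._max = max_workers
        self._wait_timeout = wait_timeout
        self._idle = []    # workers not currently used by a request
        self._total = 0    # workers created so far (idle or busy)
        self._cond = threading.Condition()

    def acquire(self):
        deadline = time.monotonic() + self._wait_timeout
        with self._cond:
            # Block only when nothing is idle AND the ceiling is reached.
            while not self._idle and self._total >= self._max:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    raise PoolExhausted("no idle worker available")
                self._cond.wait(remaining)
            if self._idle:
                return self._idle.pop()
            # Below the ceiling: reserve a slot, then spawn outside the lock.
            self._total += 1
        try:
            return self._spawn()
        except Exception:
            # Spawning failed, so give the reserved slot back.
            with self._cond:
                self._total -= 1
                self._cond.notify()
            raise

    def release(self, worker):
        # Return a worker to the idle list and wake one waiting request.
        with self._cond:
            self._idle.append(worker)
            self._cond.notify()
```

With wait_timeout=0 a request that finds the pool full fails straight away (the 503 variant); with a positive wait_timeout it waits for a worker to be released and only then gives up (the wait-then-time-out variant). Which of the two to use is exactly the part that should be configurable.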