couchdb-dev mailing list archives

From Jan Lehnardt <...@apache.org>
Subject Re: stray couchjs processes
Date Sat, 16 Apr 2011 14:54:48 GMT
Hah, thanks Adam,

this is exactly the email I hoped to see by CCing dev@ :)

Cheers
Jan
-- 

On 16 Apr 2011, at 15:09, Adam Kocoloski wrote:

> I've seen this bug in the wild.  I haven't been able to track down the
> exact root cause, but the various ets tables in couch_query_servers get
> out of sync with one another - one table will think there are no
> available processes and will cause new ones to be spawned but the others
> will still have some record of the hundreds of spawned couchjs processes.
> 
> I rewrote the gen_server to use a single ets table and refactored a few
> other things in COUCHDB-901[1].  It's missing a hard limit on the number
> of processes that we'll spawn, and instead has a soft limit above which
> it will discard processes after their workload has finished.  I'm overdue
> to finish that ticket off and get it into trunk.  Regards,
> 
> Adam
> 
> [1]: https://issues.apache.org/jira/browse/COUCHDB-901
> 
> On Apr 16, 2011, at 8:39 AM, Jan Lehnardt wrote:
> 
>> Hi Ning,
>> 
>> the correlation between couchjs and HTTP requests is that whenever a
>> request needs couchjs for anything, it will use one that is around and
>> idle. When CouchDB starts, none are idle and it will fork and exec a
>> new couchjs process. A couchjs process is not idle when a request is
>> using it. So for every concurrent request, you will get a new fork &
>> exec of a couchjs process.
>> 
>> I haven't looked at the current implementation in a while, but we
>> should look into implementing some configurable ceiling that can't
>> be crossed with more fork & exec. Requests then could either wait
>> until a couchjs is idle and eventually time out if none get freed
>> or they could get served a Service Unavailable (503) — That behaviour
>> should also be configurable.
>> 
>> CCing dev@ to see if we can get more feedback on this.
>> 
>> Cheers
>> Jan
>> -- 
>> 
>> 
>> On 15 Apr 2011, at 20:16, Ning Tan wrote:
>> 
>>> A while back there was a post about stray couchjs processes that had
>>> no apparent resolution. A similar situation happened in our
>>> environment that resulted in hundreds of couchjs processes, which
>>> caused out-of-memory problems for the server.
>>> 
>>> We are investigating the cause and would appreciate any help in
>>> pinpointing the problem. One thing that was curious to me is, how many
>>> couchjs processes are needed to support concurrent requests. I
>>> couldn't reproduce a large number of couchjs processes in my local
>>> environment. It seems that all my view/filter requests were handled by
>>> just one couchjs process.
>>> 
>>> The environment that had problems was using 1.0.1. I've been testing
>>> locally with 1.0.2.  Would that make any difference?
>>> 
>>> Also, the problematic environment had proxies sitting in front of the
>>> couch boxes, so that's another variance in our analysis. But it's hard
>>> to tell without knowing the relationship/cardinality between an HTTP
>>> connection and a couchjs process. In the original post, connections
>>> not properly closed were hinted as a potential culprit. However, it's
>>> still unclear to me how mishandled HTTP connections can result in
>>> multiple couchjs processes. If I'm not mistaken, couchjs only talks
>>> via stdin/stdout and is not handling a connection directly.
>>> 
>>> Sorry if this question doesn't have enough information. We are still
>>> in very early stages of our analysis and don't have a lot of leads
>>> yet.
>>> 
>>> Thanks!
>> 
> 
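
Editor's sketch of the configurable ceiling Jan proposes in the thread above. This is not CouchDB code: the real pool lives in Erlang in couch_query_servers, and the names BoundedWorkerPool, PoolExhausted, spawn, max_workers and wait_timeout are hypothetical, chosen only for illustration. It assumes a caller would translate PoolExhausted into a 503 response.

```python
import threading


class PoolExhausted(Exception):
    """No worker became available; a caller could map this to HTTP 503."""


class BoundedWorkerPool:
    """Reuse idle workers and never create more than max_workers of them."""

    def __init__(self, spawn, max_workers, wait_timeout=None):
        self._spawn = spawn                # hypothetical factory, stands in for fork & exec
        self._max_workers = max_workers    # hard ceiling on workers ever created
        self._wait_timeout = wait_timeout  # None: fail at once; else seconds to wait
        self._idle = []
        self._total = 0
        self._cond = threading.Condition()

    def acquire(self):
        with self._cond:
            if not self._idle and self._total < self._max_workers:
                # Nothing idle, but still below the ceiling: create a worker.
                self._total += 1
                return self._spawn()
            if not self._idle:
                # At the ceiling: either refuse immediately or wait for a release.
                if self._wait_timeout is None:
                    raise PoolExhausted("no idle worker and ceiling reached")
                got_one = self._cond.wait_for(
                    lambda: bool(self._idle), timeout=self._wait_timeout
                )
                if not got_one:
                    raise PoolExhausted("timed out waiting for an idle worker")
            return self._idle.pop()

    def release(self, worker):
        with self._cond:
            self._idle.append(worker)
            self._cond.notify()
```

With wait_timeout=None the pool behaves like the "serve a 503" option; with a number of seconds it behaves like the "wait until a couchjs is idle and eventually time out" option, so a single setting selects between the two behaviours.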

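Editor's sketch of the soft-limit behaviour Adam describes: all worker state is kept in one table, and a worker is discarded after finishing its workload whenever the pool is above the soft limit. This is an assumption-laden Python stand-in, not the COUCHDB-901 patch; SoftLimitPool, spawn, stop and soft_limit are hypothetical names, and the sketch is single-threaded for brevity.

```python
class SoftLimitPool:
    """Track every worker in one table; shrink back to soft_limit on release."""

    def __init__(self, spawn, stop, soft_limit):
        self._spawn = spawn            # hypothetical factory for a new worker
        self._stop = stop              # hypothetical callback that terminates a worker
        self._soft_limit = soft_limit
        self._workers = {}             # the single table: worker -> True if busy

    def acquire(self):
        # Prefer an idle worker that is already recorded in the table.
        for worker, busy in self._workers.items():
            if not busy:
                self._workers[worker] = True
                return worker
        # No hard limit here: always spawn when nothing is idle.
        worker = self._spawn()
        self._workers[worker] = True
        return worker

    def release(self, worker):
        if len(self._workers) > self._soft_limit:
            # Above the soft limit: discard the worker now that its work is done.
            del self._workers[worker]
            self._stop(worker)
        else:
            self._workers[worker] = False

    def size(self):
        return len(self._workers)
```

Because busy and idle state share one dictionary, there is no second table that can disagree about how many workers exist, which is the failure mode described for the multi-table version. As in the message, nothing here caps how many workers a burst of concurrent requests can create.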
