couchdb-user mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: gen_server timeout
Date Thu, 11 Aug 2011 18:06:45 GMT
Michael,

If you have this narrowed down to a specific map/reduce pair or even a
design doc and some data it would be super helpful if you could find
something reproducible. Everything I've heard about this suggests that
we're losing track of couchjs processes, which eventually fills up the
os_process_limit and leads to this timeout.

The easiest way to check is to watch "ps ax | grep couchjs | wc -l"
and see which HTTP calls make that number increase beyond whatever a
single request requires.
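
For anyone who would rather log the count over time than re-run the
command by hand, here is a minimal sketch of the same check in Python.
The function names count_couchjs and watch, and the interval and sample
defaults, are made up for illustration; the only thing it relies on is
"ps ax" being available on the host.

```python
import subprocess
import time


def count_couchjs(ps_output: str) -> int:
    """Count lines of `ps ax` output that mention couchjs."""
    count = 0
    for line in ps_output.splitlines():
        # Ignore a grep process in case the text came from a shell pipeline.
        if "couchjs" in line and "grep" not in line:
            count += 1
    return count


def watch(interval_seconds: float = 2.0, samples: int = 30) -> None:
    """Print a timestamp and the couchjs process count at a fixed interval."""
    for _ in range(samples):
        output = subprocess.run(
            ["ps", "ax"], capture_output=True, text=True, check=True
        ).stdout
        print(time.strftime("%H:%M:%S"), count_couchjs(output))
        time.sleep(interval_seconds)
```

Calling watch() in one terminal while issuing view requests from another
should show which requests leave the count higher than it was before.
Note that the plain shell pipeline can also count the grep process
itself, so its number may be one higher than what this sketch prints.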

My gut feeling is that this is a weird error condition where there's
an error in a reduce call or similar which manages to bypass a "return
process" type of call.

Let me know if you find anything.

Thanks,
Paul

On Thu, Aug 11, 2011 at 12:47 PM, Michael <newmaniese@gmail.com> wrote:
> I just had the same thing happen last night after trying a bunch of
> reductions on a large set of data. All of a sudden all of my views were
> returning this error.
>
> This morning I came back to what I was working on and had the same problem,
> all views were just returning this error.
>
> I restarted the couch service and everything seemed to be back on track.
> Next time it happens I will certainly look for these processes.
>
> I am running 1.1, once I am done with my work today I will try the same
> reductions and see if it happens again.
>
> I just wanted to throw in that I saw this recently as well.
>
> Thanks,
>
> Michael
>
> On Thu, Aug 11, 2011 at 1:30 PM, Paul Davis <paul.joseph.davis@gmail.com> wrote:
>
>> When this happens can you do a "ps ax | grep couchjs" on the machine
>> hosting CouchDB? It sounds like you've hit the process limit (which is
>> configurable). Hard to say if this is because you have lots of
>> concurrent clients holding couchjs processes or if we're leaking them
>> out of the pool somehow. If you can show that there aren't any clients
>> holding them (ie, from view updates or long list calls) then I'd be
>> super intrigued to see if you can narrow it down to a test case. I've
>> heard a couple anecdotes about leakage here but never with enough
>> detail to start looking for a root cause.
>>
>> On Thu, Aug 11, 2011 at 5:26 AM, Martin Hewitt <martin@thenoi.se> wrote:
>> > I've put a log extract on Pastebin here: http://pastebin.com/PuJm08J0
>> >
>> > Sorry, I'm not familiar with erlang, so I'm not sure which bits are
>> > pertinent, and there may well be more than one error trace in there.
>> >
>> > Any help would be greatly appreciated.
>> >
>> > Martin
>> >
>> > On 11 Aug 2011, at 11:14, Martin Hewitt wrote:
>> >
>> >> Hi all,
>> >>
>> >> I'm getting the following error when trying to load some views:
>> >>
>> >>
>> >> {"error":"timeout","reason":"{gen_server,call,[couch_query_servers,{get_proc,<<\"javascript\">>}]}"}
>> >>
>> >> Googling around, it seems this issue has been "fixed" way before I even
>> >> started using CouchDB. Any ideas what could be causing it now?
>> >>
>> >> Thanks,
>> >>
>> >> Martin
>> >
>> >
>>
>
