From: Paul Davis
Date: Thu, 11 Aug 2011 14:18:17 -0500
Subject: Re: gen_server timeout
To: user@couchdb.apache.org
References: <297F39FD-EA1B-44A0-B5C3-5633F6CE5988@thenoi.se>

Bug tracker is here:

https://issues.apache.org/jira/browse/COUCHDB

Knowing that it's related to reduce overflows might be enough, but
please open a ticket and paste any code you have in case I can't get
it to reproduce right away.

On Thu, Aug 11, 2011 at 1:52 PM, Michael wrote:
> Definitely am able to reproduce this. One couchjs process is left for each
> reduce_overflow_error I have. I have a map/reduce function that is causing
> the reduce overflow error, if that is useful. Is this the best forum for
> posting those files or is there a bug tracker?
>
> Thanks,
>
> Michael
>
>
> On Thu, Aug 11, 2011 at 2:06 PM, Paul Davis wrote:
>
>> Michael,
>>
>> If you have this narrowed down to a specific map/reduce pair or even a
>> design doc and some data it would be super helpful if you could find
>> something reproducible. Everything I've heard about this is that we're
>> losing track of couchjs processes which then ends up filling up the
>> os_process_limit which would eventually lead to this.
>>
>> The easiest way to check is to just watch "ps ax | grep couchjs | wc
>> -l" to see which HTTP calls lead to that number start increasing
>> beyond whatever a single request requires.
>>
>> My gut feeling is that this is a weird error condition where there's
>> an error in a reduce call or similar which manages to bypass a "return
>> process" type of call.
>>
>> Let me know if you find anything.
>>
>> Thanks,
>> Paul
>>
>> On Thu, Aug 11, 2011 at 12:47 PM, Michael wrote:
>> > I just had the same thing happen last night after trying a bunch of
>> > reductions on a large set of data. All of a sudden all of my views were
>> > returning this error.
>> >
>> > This morning I came back to what I was working on and had the same problem,
>> > all views were just returning this error.
>> >
>> > I restarted the couch service and everything seemed to be back on track.
>> > Next time it happens I will certainly look for these processes.
>> >
>> > I am running 1.1, once I am done with my work today I will try the same
>> > reductions and see if happens again.
>> >
>> > I just wanted to throw in that I saw this recently as well.
>> >
>> > Thanks,
>> >
>> > Michael
>> >
>> > On Thu, Aug 11, 2011 at 1:30 PM, Paul Davis wrote:
>> >
>> >> When this happens can you do a "ps ax | grep couchjs" on the machine
>> >> hosting CouchDB? It sounds like you've hit the process limit (which is
>> >> configurable). Hard to say if this is because you have lots of
>> >> concurrent clients holding couchjs processes or if we're leaking them
>> >> out of the pool somehow. If you can show that there aren't any clients
>> >> holding them (ie, from view updates or long list calls) then I'd be
>> >> super intrigued to see if you can narrow it down to a test case. I've
>> >> heard a couple anecdotes about leakage here but never with enough
>> >> detail to start looking for a root cause.
>> >>
>> >> On Thu, Aug 11, 2011 at 5:26 AM, Martin Hewitt wrote:
>> >> > I've put a log extract on Pastebin here: http://pastebin.com/PuJm08J0
>> >> >
>> >> > Sorry, I'm not familiar with erlang, so I'm not sure which bits are
>> >> > pertinent, and there may well be more than one error trace in there.
>> >> >
>> >> > Any help would be greatly appreciated.
>> >> >
>> >> > Martin
>> >> >
>> >> > On 11 Aug 2011, at 11:14, Martin Hewitt wrote:
>> >> >
>> >> >> Hi all,
>> >> >>
>> >> >> I'm getting the following error when trying to load some views:
>> >> >>
>> >> >> {"error":"timeout","reason":"{gen_server,call,[couch_query_servers,{get_proc,<<\"javascript\">>}]}"}
>> >> >>
>> >> >> Googling around, it seems this issue has been "fixed" way before I even
>> >> >> started using CouchDB. Any ideas what could be causing it now?
>> >> >>
>> >> >> Thanks,
>> >> >>
>> >> >> Martin
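
The process-count check suggested in this thread ("ps ax | grep couchjs | wc -l") can be approximated with a small script that samples the count at intervals, which makes it easier to see which requests leave extra couchjs processes behind. This is only a sketch and not part of CouchDB: the function names (count_couchjs, sample_couchjs_count, watch), the 5-second interval, and the sample count are all assumptions chosen for illustration. Unlike the raw shell pipeline, it skips the "grep couchjs" line itself.

```python
import subprocess
import time


def count_couchjs(ps_output: str) -> int:
    """Count lines of `ps ax` output that mention couchjs.

    Lines belonging to a `grep couchjs` command are skipped, so the
    result is the number of matching processes rather than the raw
    `wc -l` figure. Any other process whose command line happens to
    contain "couchjs" would still be counted.
    """
    return sum(
        1
        for line in ps_output.splitlines()
        if "couchjs" in line and "grep" not in line
    )


def sample_couchjs_count() -> int:
    """Run `ps ax` once and return the current couchjs process count."""
    result = subprocess.run(
        ["ps", "ax"], capture_output=True, text=True, check=True
    )
    return count_couchjs(result.stdout)


def watch(interval_seconds: float = 5.0, samples: int = 12) -> None:
    """Print a timestamped couchjs count at a fixed interval.

    interval_seconds and samples are illustrative defaults, not
    values recommended anywhere in this thread.
    """
    for _ in range(samples):
        print(time.strftime("%H:%M:%S"), sample_couchjs_count())
        time.sleep(interval_seconds)
```

Calling watch() in one terminal while issuing view requests in another should show whether the count keeps climbing after a request that returns reduce_overflow_error, which is the pattern Michael describes above.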