From: Chris Anderson
To: Brian Candler
Cc: dev@couchdb.apache.org
Subject: Re: reduce_limit error
Date: Tue, 5 May 2009 13:19:10 -0700
In-Reply-To: <20090505195026.GA15177@uk.tiscali.com>
References: <20090505195026.GA15177@uk.tiscali.com>

On Tue, May 5, 2009 at 12:50 PM, Brian Candler wrote:
> On Mon, May 04, 2009 at 03:08:38PM -0700, Chris Anderson wrote:
>> I'm checking in a patch that should cut down on the number of mailing
>> list questions asking why a particular reduce function is hella slow.
>> Essentially the patch throws an error if the reduce function return
>> value is not at least half the size of the values array that was
>> passed in. (The check is skipped if the size is below a fixed amount,
>> 200 bytes for now).
>
> I think that 200 byte limit is too low, as I have now had to turn off the
> reduce_limit on my server for this:
>
> RestClient::RequestFailed: 500 reduce_overflow_error (Reduce output must
> shrink more rapidly. Current output: '[{"v4/24": 480,"v4/20": 10,"v4/26":
> 10,"v4/19": 3,"v4/27": 23,"v4/18": 1,"v4/28": 32,"v4/32": 424,"v4/25":
> 17,"v4/30": 28,"v4/22": 15,"v4/16": 200,"v4/29": 74,"v4/21": 1,"v4/14":
> 41,"v4/12": 1,"v4/13": 1,"v4/17": 4,"v4/11": 1}]')
>
> I'd have thought a threshold of 4KB would be safe enough?
>

That looks an awful lot like a "wrong" kind of reduce function.
Is there a reason why you don't just emit map keys like "v4/24" and use
a normal row-counting reduce? It looks like this reduce would eventually
overwhelm the interpreter, as your set of hash keys looks like it may
grow without bounds as it encounters more data. Perhaps I'm wrong.

200 bytes is a bit small, but I'd be worried that with 4kb users
wouldn't get a warning until they had moved a "bad" reduce to production
data.

If your reduce is ok even on giant data sets, maybe you can experiment
with the minimum value in share/server/views.js line 52 that will allow
you to proceed.

-- 
Chris Anderson
http://jchrisa.net
http://couch.io
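
To make the suggestion about emitting keys like "v4/24" and using a
row-counting reduce concrete, here is a minimal sketch. It rests on
assumptions that are not in the thread: the document field name
"prefix" is hypothetical, the function names are made up for
illustration, and groupedCounts is only a stand-in for querying the
view with group=true so the example can run under plain Node.js. In a
real design document emit is a global rather than a parameter.

```javascript
// Hypothetical document shape: { prefix: "v4/24" }. The field name "prefix"
// is an assumption for illustration; it does not come from the thread.

// Map step: emit the category itself as the key, so the view index does the
// grouping instead of the reduce function building a hash.
// emit is passed in here only so the sketch runs outside CouchDB.
function mapDoc(doc, emit) {
  if (doc.prefix) {
    emit(doc.prefix, null);
  }
}

// Row-counting reduce: the output is a single number regardless of how many
// distinct keys exist, so it does not grow with the data.
function countRows(keys, values, rereduce) {
  if (rereduce) {
    // On rereduce, values holds partial counts from earlier reduce calls.
    return values.reduce(function (a, b) { return a + b; }, 0);
  }
  return values.length;
}

// Stand-in for a group=true query (not CouchDB code): collect the emitted
// rows per key, then run the reduce once for each key.
function groupedCounts(docs) {
  var rowsByKey = {};
  docs.forEach(function (doc) {
    mapDoc(doc, function (key, value) {
      (rowsByKey[key] = rowsByKey[key] || []).push(value);
    });
  });
  var result = {};
  Object.keys(rowsByKey).forEach(function (key) {
    result[key] = countRows(null, rowsByKey[key], false);
  });
  return result;
}

var sample = [
  { prefix: "v4/24" },
  { prefix: "v4/24" },
  { prefix: "v4/32" },
  { other: true } // no prefix field, so the map step skips it
];
console.log(groupedCounts(sample));
```

With this shape each reduce result is one small integer per key, so the
per-category totals come from the grouped query rather than from a hash
that the reduce function keeps extending.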