incubator-couchdb-dev mailing list archives

From Chris Anderson <jch...@apache.org>
Subject Re: reduce_limit error
Date Tue, 05 May 2009 20:19:10 GMT
On Tue, May 5, 2009 at 12:50 PM, Brian Candler <B.Candler@pobox.com> wrote:
> On Mon, May 04, 2009 at 03:08:38PM -0700, Chris Anderson wrote:
>> I'm checking in a patch that should cut down on the number of mailing
>> list questions asking why a particular reduce function is hella slow.
>> Essentially the patch throws an error if the reduce function return
>> value is not at least half the size of the values array that was
>> passed in. (The check is skipped if the size is below a fixed amount,
>> 200 bytes for now).
>
> I think that 200 byte limit is too low, as I have now had to turn off the
> reduce_limit on my server for this:
>
> RestClient::RequestFailed: 500 reduce_overflow_error (Reduce output must
> shrink more rapidly. Current output: '[{"v4/24": 480,"v4/20": 10,"v4/26":
> 10,"v4/19": 3,"v4/27": 23,"v4/18": 1,"v4/28": 32,"v4/32": 424,"v4/25":
> 17,"v4/30": 28,"v4/22": 15,"v4/16": 200,"v4/29": 74,"v4/21": 1,"v4/14":
> 41,"v4/12": 1,"v4/13": 1,"v4/17": 4,"v4/11": 1}]')
>
> I'd have thought a threshold of 4KB would be safe enough?
>

That looks an awful lot like a "wrong" kind of reduce function. Is
there a reason why you don't just emit map keys like "v4/24" and use a
normal row-counting reduce? It looks like this reduce would eventually
overwhelm the interpreter, as your set of hash keys looks like it may
grow without bounds as it encounters more data.
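
A minimal sketch of that alternative, runnable outside CouchDB. The document
fields `version` and `prefixlen` are assumptions (the original documents are
not shown), and `emit`, `rows` and `groupReduce` are hypothetical stand-ins
for what the view server does for a grouped query:

```javascript
// Hypothetical stand-in for the view server's emit(); collects map rows.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map: one row per document, keyed like "v4/24".
// The field names `version` and `prefixlen` are assumed for illustration.
function map(doc) {
  if (doc.version && doc.prefixlen !== undefined) {
    emit("v" + doc.version + "/" + doc.prefixlen, 1);
  }
}

// Reduce: plain row counting. Each map value is 1, so summing works for
// both the reduce and rereduce steps, and the output is a single number
// no matter how many distinct keys exist.
function reduce(keys, values, rereduce) {
  return values.reduce(function (a, b) { return a + b; }, 0);
}

// Hypothetical helper imitating a grouped query: run the map over all
// docs, then reduce the values separately for each distinct key.
function groupReduce(docs) {
  rows.length = 0;
  docs.forEach(map);
  const byKey = {};
  rows.forEach(function (row) {
    (byKey[row.key] = byKey[row.key] || []).push(row.value);
  });
  const out = {};
  Object.keys(byKey).forEach(function (key) {
    out[key] = reduce(null, byKey[key], false);
  });
  return out;
}
```

Querying such a view with group=true returns one row per key, so the per-key
counts come from the view index rather than from a hash built inside the
reduce.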

Perhaps I'm wrong. 200 bytes is a bit small, but I'd be worried that
with a 4KB threshold users wouldn't get a warning until they had moved
a "bad" reduce to production data.

If your reduce is ok even on giant data sets, maybe you can experiment
with the minimum value in share/server/views.js line 52 that will
allow you to proceed.
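
For reference, the check described in the quoted message can be sketched
roughly as below. This is an illustration under assumptions, not the code in
views.js: the function name `checkReduceLimit`, the `minBytes` parameter and
the use of JSON string length as the size measure are all hypothetical.

```javascript
// Sketch of a reduce_limit-style check (hypothetical, not the views.js code).
// Sizes are measured as JSON string lengths, which is an assumption.
function checkReduceLimit(values, output, minBytes) {
  if (minBytes === undefined) {
    minBytes = 200; // the fixed amount mentioned in the thread
  }
  const inputSize = JSON.stringify(values).length;
  const outputSize = JSON.stringify(output).length;

  // Outputs at or below the threshold skip the check entirely.
  // Above it, the output must be at most half the size of the input.
  if (outputSize > minBytes && outputSize * 2 > inputSize) {
    throw new Error(
      "reduce_overflow_error: Reduce output must shrink more rapidly. " +
      "Current output: '" + JSON.stringify(output) + "'"
    );
  }
  return output;
}
```

Raising `minBytes` lets a larger but bounded output through, while a reduce
whose output keeps growing with the input would still be rejected once it
passes the new threshold.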

-- 
Chris Anderson
http://jchrisa.net
http://couch.io
