incubator-couchdb-dev mailing list archives

From J Chris Anderson <jch...@gmail.com>
Subject Re: Hitting the reduce overflow boundary
Date Fri, 05 Mar 2010 16:58:02 GMT

On Mar 5, 2010, at 8:23 AM, Dirkjan Ochtman wrote:

> On Fri, Mar 5, 2010 at 12:17, Dirkjan Ochtman <djc.ochtman@gmail.com> wrote:
>> I would really like to have someone from the dev team speak up on this
>> one, since I'd kind of like to re-enable the reduce_limit option, but
>> I don't think this view should be classified as overflowing.
> 
> I happily found Adam in IRC, who explained this to me:
> 
> 17:05 <+kocolosk> djc: so the current reduce_limit calculation is in main.js
> 17:06 <+kocolosk> the JSONified reduction needs to be less than 200 bytes, and
>                  it needs to be less than half of the size of the input map
>                  values
> 17:06 <+kocolosk> you could try tweaking those to see which condition you're
>                  failing
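> 
> For reference, the check in main.js goes roughly like this (quoting
> from memory, so treat the exact names as approximate):
> 
>     // runReduce in share/server/main.js (approximate)
>     var reduce_line = Couch.toJSON(reductions);
>     var reduce_length = reduce_line.length;
>     // overflow when the JSONified reduction is both over 200 bytes
>     // and more than half the size of the input map values line
>     if (query_config && query_config.reduce_limit &&
>         reduce_length > 200 && (reduce_length * 2) > line_length) {
>       throw(["error", "reduce_overflow_error",
>              "Reduce output must shrink more rapidly"]);
>     }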
> 
> The way I see it, the reduce phase should work so that the result
> for a collection of documents is smaller than, or at least not much
> larger than, the largest single object in the input set. That still
> prevents the unbounded growth the check is meant to catch. Such a
> rule should also work on slightly larger inputs, because those just
> mean a larger constant, not exponential growth.
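> 
> As a rough sketch (illustrative only, not actual CouchDB code), the
> check could compare the reduction against the largest single input
> value instead:
> 
>     // find the largest JSONified input value
>     var max_value_length = 0;
>     for (var i = 0; i < values.length; i++) {
>       var len = Couch.toJSON(values[i]).length;
>       if (len > max_value_length) max_value_length = len;
>     }
>     // GROWTH_FACTOR is a small constant factor, e.g. 2
>     if (reduce_length > max_value_length * GROWTH_FACTOR) {
>       throw(["error", "reduce_overflow_error",
>              "Reduce output must shrink more rapidly"]);
>     }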
> 
> So I see two problems with the current rule:
> 
> - it has a fixed limit of 200 bytes, which isn't very reasonable: a
> larger reduction doesn't by itself mean unbounded growth is going on
> - it assumes that all the values in the input map are of roughly
> equal size, which isn't actually a requirement
> 
> Am I crazy, or would a scheme like I proposed above be an improvement?

Definitely. A patch to make the reduce_overflow_threshold configurable (with a
default of 200 bytes) would be a major improvement and not hard to do.
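
Something along these lines in main.js, using the config key name from above
(a sketch only; the final key name and plumbing are up to whoever writes the
patch):

    // sketch: read the threshold from the query server config
    // instead of hardcoding 200
    var threshold = (query_config && query_config.reduce_overflow_threshold)
        || 200; // default stays at 200 bytes
    if (query_config && query_config.reduce_limit &&
        reduce_length > threshold && (reduce_length * 2) > line_length) {
      throw(["error", "reduce_overflow_error",
             "Reduce output must shrink more rapidly"]);
    }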

Chris

> 
> Cheers,
> 
> Dirkjan

