couchdb-user mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: Ordering of keys into reduce function
Date Wed, 13 May 2009 14:08:05 GMT
On Wed, May 13, 2009 at 7:50 AM, Brian Candler <B.Candler@pobox.com> wrote:
> Hi,
>
> I want to write a reduce function which, when reducing over a range of keys,
> gives the minimum and maximum *key* found in that range. (*)
>
> This could be done very easily and efficiently if I could rely on the
> following two properties:
>
> (1) keys/values passed into a first reduce are in increasing order of key
>

Yes, but with caveats. I couldn't track down the cause, but I was seeing
an odd final reduce call whose keys were in increasing order yet spanned
the entire range of possible keys.

> (2) reduced values passed into a re-reduce are for increasing key ranges
>

If you mean within a single call to re-reduce, then yes. If you mean
whether the re-reduce calls themselves are processed in order from left
to right through the btree, then no. Or at least, probably not in the
way you expect.
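
Since the ordering isn't something to lean on, one option is to not depend
on it at all and compare every key explicitly. A minimal sketch, not tested
against a live server: it assumes the keys are plain strings or numbers for
which JavaScript's < and > agree with the view's collation (they may not for
mixed-case or non-ASCII strings), and the name minMaxKey is hypothetical,
there only so the function can be run outside CouchDB. It uses the fact that
on a first reduce `keys` is an array of [key, docid] pairs, and on a
re-reduce it is null.

```javascript
// Hypothetical name; in a design document this would be the anonymous
// reduce function of the view.
var minMaxKey = function (keys, values, rereduce) {
    var min, max, i, lo, hi;
    // On rereduce `keys` is null, so the count comes from `values`.
    var n = rereduce ? values.length : keys.length;
    for (i = 0; i < n; i++) {
        if (rereduce) {
            // Each value is a {"min": ..., "max": ...} from an earlier call.
            lo = values[i].min;
            hi = values[i].max;
        } else {
            // Each entry of `keys` is a [key, docid] pair.
            lo = hi = keys[i][0];
        }
        // Assumption: < and > order these keys the same way the view does.
        if (i === 0 || lo < min) min = lo;
        if (i === 0 || hi > max) max = hi;
    }
    return {"min": min, "max": max};
};
```

This costs a comparison per key instead of reading the first and last
entries, but the result does not change if the keys or the partial results
arrive in a different order.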

> The question is, can I rely on both of these properties? Especially in the
> re-reduce case?
>
> If I could, then to calculate the min I only need to take the first key
> passed, or the min from the first reduce value. Similarly for max, I'd take
> the last key, or the max from the last reduce value.
>
> Regards,
>
> Brian.
>
> (*) An alternative would be to do two queries: startkey=aaa&limit=1 and
> endkey=bbb&limit=1&descending=true. I would like to avoid two queries, and
> I'd also like this functionality for group_level=n, such that within each
> group I know the minimum and maximum key.
>

You mean the minimum and maximum value?

I'm pretty sure (but I haven't had coffee yet) that something like
this would work:

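//map
function(doc) { emit(doc.key_val, doc.range_val); }

//reduce
function(keys, values, rereduce)
{
    // Helper bodies filled in as one possible version, assuming numeric
    // range_val. On rereduce each value is a {"min": ..., "max": ...}
    // result from an earlier reduce call.
    var find_min_in_values = function(vals, rered) {
        var mins = vals.map(function(v) { return rered ? v.min : v; });
        return Math.min.apply(null, mins);
    };
    var find_max_in_values = function(vals, rered) {
        var maxes = vals.map(function(v) { return rered ? v.max : v; });
        return Math.max.apply(null, maxes);
    };

    var min = find_min_in_values(values, rereduce);
    var max = find_max_in_values(values, rereduce);

    return {"min": min, "max": max};
}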

Note, though, that with this you can only query for the min/max of one
key range at a time.
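
A min/max reduce of this kind can also be exercised outside the server to
check that the answer doesn't depend on how the rows are split between
reduce and re-reduce calls. A rough sketch with hypothetical names (minMax,
simulate) and made-up rows; the fixed-size chunking is only a stand-in for
btree node boundaries, not how CouchDB actually partitions its calls.

```javascript
// Min/max over emitted values; handles both reduce and rereduce input.
var minMax = function (keys, values, rereduce) {
    var min, max, i, lo, hi;
    for (i = 0; i < values.length; i++) {
        lo = rereduce ? values[i].min : values[i];
        hi = rereduce ? values[i].max : values[i];
        if (i === 0 || lo < min) min = lo;
        if (i === 0 || hi > max) max = hi;
    }
    return {"min": min, "max": max};
};

// Hypothetical helper: reduce fixed-size chunks of rows, then rereduce
// the partial results in a single call.
var simulate = function (reduceFn, rows, chunkSize) {
    var partials = [], i, chunk;
    for (i = 0; i < rows.length; i += chunkSize) {
        chunk = rows.slice(i, i + chunkSize);
        partials.push(reduceFn(
            chunk.map(function (r) { return [r.key, r.id]; }),
            chunk.map(function (r) { return r.value; }),
            false));
    }
    return reduceFn(null, partials, true);
};

// Made-up rows, shaped like the output of the map function.
var rows = [
    {key: "a", id: "d1", value: 7},
    {key: "b", id: "d2", value: 3},
    {key: "c", id: "d3", value: 9},
    {key: "d", id: "d4", value: 5}
];
var result = simulate(minMax, rows, 3);
```

Calling simulate with different chunk sizes should give the same result
each time; if it doesn't, the reduce is leaning on ordering or chunking
somewhere.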
