incubator-couchdb-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: reduce order vs descending
Date Wed, 23 Feb 2011 21:59:24 GMT
2011/2/23 Ɓukasz Mielicki <mielicki@gmail.com>:
> On Wed, Feb 23, 2011 at 22:03, Paul Davis <paul.joseph.davis@gmail.com> wrote:
>
>>> For: reduce (null, [key_range_A1_A2, key_range_B1_B2, key_range_C1_C2], true)
>>> Does the relation A1 < A2 < B1 < B2 < C1 < C2 hold? Does it depend
on
>>> descending?
>>> For: reduce ([k1,k2,k3], [v1,v2,v3], false)
>>> Can I depend on k1 < k2 < k3 if descending=true?
>
>> These are specifically the assertions I'm saying you can't rely on. In
>> your case you will absolutely need to sort the input data.
>
>> The only assumption I think its possible to make is that re-reductions
>> will always process reductions from contiguous parts of the b+tree.
>> I'm only at the *think* stage of certainty. I haven't gone through the
>> implementation to check if that's an absolute truth, and I can't yet
>> guarantee that it won't ever change if it is true.
>>
>> Just in case that doesn't make sense, imagine you have a set of
>> reductions that are from ordered regions of the view b+tree labeled as
>> such: A, B, C, D. That is to say, data in the reduction A is based
>> entirely of keys <= reduction keys in B <= reduction keys in C <=
>> reduction keys in D. Given that, the only assumption I think is
>> possible is that you are guaranteed that if A and C are passed
>> simultaneously to a re-reduce, then B is guaranteed to be passed as
>> well. In other words, you wont get a reduction graph that looks like
>> ((A | C) | (B | D)) which would break this sort of approach.
>
> Thanks for clarifying that.
> The latter assumption was so inherent to me I didn't even consider it.
> Don't think there would be much use of such reduce results at query
> time.
>
> However form bird's eye view basing on that one I cannot see a reason
> not to preserve the first two (in terms of computational complexity
> even with reduce parallelized). It seems to be only a matter of
> preserving original b-tree partitioning in parameter passing.
> Providing such guarantees would be beneficial in two ways:
> - building a view would not depend on query parameters (sane)
> - left/right-reduce semantics available (without sorting)
>
> What you think?

I don't think any of those things should be guaranteed because that
means we won't be able to change them at some point in the future if
the need arises and I can't predict what changes we might need to make
for various situations.

I'm also suddenly wondering how this would behave when run on a
cluster with the reductions.

Mime
View raw message