incubator-couchdb-user mailing list archives

From Brian Candler <B.Cand...@pobox.com>
Subject Re: 'Grouping' documents so that a set of documents is passed to the view function
Date Thu, 25 Jun 2009 09:08:39 GMT
On Thu, Jun 25, 2009 at 09:34:31AM +0100, Brian Candler wrote:
> Perhaps it will help you to understand this if you consider the limiting
> case where exactly one document is fed into the 'reduce' function at a time,
> and then the outputs of the reduce functions are combined with a large
> re-reduce phase.

Incidentally, this is a realistic scenario. Given N documents, it's quite
possible that CouchDB will reduce the first N-1, then reduce the last 1, and
then re-reduce those two values. This might happen because of how the
documents are split between Btree nodes, or because there is a limit on the
number of documents passed to the reduce function in one go. Either way, it
is an implementation detail over which you have no control, so you must write
your reduce/rereduce to give the same answer for *any* partitioning of the
documents.
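As a minimal sketch of a partition-safe reduce (the `function(keys, values,
rereduce)` signature is CouchDB's; the document values and split point here
are made up for illustration), a plain sum gives the same answer whether the
documents are reduced all at once or in the N-1-then-1 split described above:

```javascript
// A partition-safe reduce: summing values. Because addition is
// commutative and associative, any split of the inputs gives the
// same final total.
function reduceSum(keys, values, rereduce) {
  // In CouchDB, `values` holds mapped values on the first pass and
  // previous reduce outputs when `rereduce` is true; for a plain
  // sum the two cases are handled identically.
  return values.reduce(function (acc, v) { return acc + v; }, 0);
}

// Simulate CouchDB reducing all N documents in one go...
var docs = [3, 1, 4, 1, 5, 9];
var allAtOnce = reduceSum(null, docs, false);

// ...versus reducing the first N-1, then the last 1, then
// re-reducing the two partial results.
var firstPart = reduceSum(null, docs.slice(0, 5), false);
var lastPart  = reduceSum(null, docs.slice(5), false);
var rereduced = reduceSum(null, [firstPart, lastPart], true);

console.log(allAtOnce, rereduced); // both 23
```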

More info at http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views

"To make incremental Map/Reduce possible, the Reduce function has the
requirement that not only must it be referentially transparent, but it must
also be commutative and associative for the array value input, to be able to
reduce on its own output and get the same answer, like this:

f(Key, Values) == f(Key, [ f(Key, Values) ] )"

Now, at first glance your re-reduce function appears to satisfy that
condition, so perhaps there should be another one: namely, that for any
partitioning of Values into subsets Values1, Values2, ... then

  f(Key, Values) == f(Key, [ f(Key,Values1), f(Key,Values2), ... ] )

But I am not a mathematician so I'm not sure if this condition is actually
stronger.
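For what it's worth, the partitioning condition does appear to be strictly
stronger. A naive "mean" reduce (hypothetical, for illustration only)
satisfies the wiki's identity f(Key, Values) == f(Key, [ f(Key, Values) ]),
because the mean of a one-element list is that element, yet it gives
different answers for different partitionings:

```javascript
// A naive "mean" reduce: it passes the single-value identity from
// the wiki, but it is NOT safe under arbitrary partitioning.
function naiveMean(keys, values, rereduce) {
  var sum = values.reduce(function (a, v) { return a + v; }, 0);
  return sum / values.length;
}

var vals = [2, 4, 9];

// The wiki's identity holds: mean([mean(vals)]) == mean(vals).
var once    = naiveMean(null, vals, false);   // 5
var wrapped = naiveMean(null, [once], true);  // 5

// But an uneven partition gives a different answer: the partial
// means are weighted equally, regardless of how many documents
// each one covered.
var part = naiveMean(null, [
  naiveMean(null, [2, 4], false),             // 3
  naiveMean(null, [9], false)                 // 9
], true);                                     // 6, not 5

console.log(once, wrapped, part);
```

This is why mean-like reduces are usually written to return a
{sum, count} pair and divide only at the end.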

Regards,

Brian.
