couchdb-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Couchdb Wiki] Update of "Introduction to CouchDB views" by BrianCandler
Date Thu, 25 Jun 2009 09:42:46 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Couchdb Wiki" for change notification.

The following page has been changed by BrianCandler:
http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views

The comment on the change is:
Inline exposition of reduce/rereduce

------------------------------------------------------------------------------
  
  Often, reduce functions can be written to handle rereduce calls without any extra code,
like the summation function above. In that case, the ''rereduce'' argument can be ignored
and in JavaScript, it can be omitted from the function definition entirely.
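
A minimal sketch of such a function, assuming the emitted values are plain numbers. The name `sumReduce` and the sample batches are illustrative only. Because adding partial sums is the same operation as adding the original values, the third argument is simply left out:

```javascript
// Sum reduce: one code path serves both reduce and rereduce calls.
// On reduce, `values` holds emitted values; on rereduce, it holds earlier sums.
function sumReduce(keys, values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Simulate four emitted values arriving in two separate batches.
var partialA = sumReduce(null, [1, 2]);
var partialB = sumReduce(null, [3, 4]);

// Rereduce step: the partial sums are combined by the same function.
var finalSum = sumReduce(null, [partialA, partialB]);
```

The final result is the same as if all four values had been passed in a single call, which is what makes the ''rereduce'' flag unnecessary here.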
  
+ === Reduce vs rereduce ===
+ 
+ On a large database, the objects to be reduced are sent to your reduce function in batches.
These batches are split on B-tree node boundaries, which may fall in arbitrary places.
+ 
+ [http://mail-archives.apache.org/mod_mbox/couchdb-user/200903.mbox/<20090330084727.GA7913@uk.tiscali.com>
For example], suppose you have a view which emits key->value pairs like this:
+ 
+ {{{
+ [X, Y, 0]  -> Object_A
+ [X, Y, 1]  -> Object_B1
+ [X, Y, 1]  -> Object_B1
+ [X, Y, 1]  -> Object_B1
+ [Z, Q, 0] ....
+ }}}
+ 
+ Your reduce function may receive
+ 
+ {{{
+    [Object_A, Object_B1]
+ }}}
+ 
+ and then in a separate invocation
+ 
+ {{{
+    [Object_B1, Object_B1]
+ }}}
+ 
+ The outputs of these two reduce invocations will then be passed to your reduce function again
with rereduce=true to produce the final answer. You cannot rely on all four rows being passed
to the initial reduce function in a single call.
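
The two-stage flow above can be imitated outside CouchDB. The following is a sketch only: `countReduce` is a hypothetical row-counting function, the object names are stand-in strings, and ''keys'' is passed as null because the sketch does not use it. Counting is a case where the ''rereduce'' flag matters, since the second stage must add counts rather than count them:

```javascript
// Row count: reduce and rereduce need different logic.
function countReduce(keys, values, rereduce) {
  if (rereduce) {
    // `values` are counts returned by earlier invocations: add them up.
    var total = 0;
    for (var i = 0; i < values.length; i++) {
      total += values[i];
    }
    return total;
  }
  // `values` are the emitted values themselves: count them.
  return values.length;
}

// First stage: the four rows arrive in two separate batches.
var first = countReduce(null, ["Object_A", "Object_B1"], false);
var second = countReduce(null, ["Object_B1", "Object_B1"], false);

// Second stage: the two partial results are combined with rereduce=true.
var rowCount = countReduce(null, [first, second], true);
```

Ignoring the flag here would return 2 (the number of partial results) instead of 4.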
+ 
+ Furthermore, due to reduce optimisations, your function may be called for only some of the
blocks being reduced. For example, take these three B-tree nodes:
+ 
+ {{{
+      [a b c d e f g] [h i j k l m n] [o p q r s t u]
+             R1              R2              R3
+ }}}
+ 
+ The reduce value of all the items in each B-tree node is stored within that node, e.g.
{{{[a b c d e f g]}}} reduces to {{{R1}}}. Now suppose someone asks for a reduce value across a
key range:
+ 
+ {{{
+                       key range
+               <----------------------------->
+      [a b c d e f g] [h i j k l m n] [o p q r s t u]
+ }}}
+ 
+ CouchDB will call your reduce function to calculate a value for {{{[e f g]}}} and for
{{{[o p q r]}}}, but will use the existing stored/calculated value of R2 across the middle block.
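
As a rough simulation of this behaviour (not CouchDB's actual internals), assume the items a..u carry the values 1..21 and the reduction is a sum. The names `sumValues`, `nodes` and `stored` are made up for the sketch:

```javascript
// Sum works unchanged for both reduce and rereduce calls.
function sumValues(keys, values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Three B-tree nodes holding the values for a..g, h..n and o..u.
var nodes = [
  [1, 2, 3, 4, 5, 6, 7],
  [8, 9, 10, 11, 12, 13, 14],
  [15, 16, 17, 18, 19, 20, 21]
];

// Reductions computed earlier and stored with each node: R1, R2, R3.
var stored = nodes.map(function (node) {
  return sumValues(null, node);
});

// Key range e..r: only the partially covered nodes are reduced afresh.
var left = sumValues(null, nodes[0].slice(4));     // e f g
var right = sumValues(null, nodes[2].slice(0, 4)); // o p q r

// The middle node is fully covered, so its stored value R2 is reused.
var rangeTotal = sumValues(null, [left, stored[1], right]);
```

In this simulation the function is never called with the values h..n while answering the range query; it only sees their previously stored reduction.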
+ 
+ Therefore, it is wrong to attempt to maintain any sort of state in your reduce function
between invocations. And because the B-tree node boundaries can fall in any place, it is
also wrong to attempt to cross-reference adjacent documents. Any cross-referencing needs to
take place in the client, not in a reduce function.
+ 
  === Access Strategy ===
  
  For queries that are not meant to condense the amount of information, you can often
live without a reduce function. A common strategy is to put the data you want to select by
into the ''key'' part and then use ''startkey'' and ''endkey'' on the result.
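
A small sketch of building such a range query from a client. The host, database, design document and view names are hypothetical, and the path layout is assumed to be the `_design/.../_view/...` form. The key values must be JSON-encoded and then URL-escaped:

```javascript
// Build a view URL restricted to a key range.
// `base` is the view's URL; startkey and endkey are any JSON-serialisable keys.
function viewRangeUrl(base, startkey, endkey) {
  return base +
    "?startkey=" + encodeURIComponent(JSON.stringify(startkey)) +
    "&endkey=" + encodeURIComponent(JSON.stringify(endkey));
}

// Hypothetical view keyed by [year, month, day]: select all rows for June 2009.
var url = viewRangeUrl(
  "http://localhost:5984/mydb/_design/example/_view/by_date",
  [2009, 6, 1],
  [2009, 6, 30]
);
```

The resulting URL can be fetched with any HTTP client; the rows returned are those whose keys sort between the two bounds.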
@@ -240, +287 @@

   * [http://damienkatz.net/2008/02/incremental_map_1.html]
   * [http://horicky.blogspot.com/2008/10/couchdb-implementation.html]
  
- These mailing list posts may also be helpful:
-  * [http://mail-archives.apache.org/mod_mbox/couchdb-user/200903.mbox/<20090330084727.GA7913@uk.tiscali.com>]
-  * [http://mail-archives.apache.org/mod_mbox/couchdb-user/200906.mbox/<20090621202105.GA1937@uk.tiscali.com>]
- 
