couchdb-user mailing list archives

From Brian Candler
Subject Re: Reduce Assumptions
Date Mon, 06 Apr 2009 08:51:00 GMT
On Fri, Apr 03, 2009 at 07:55:45PM -0700, Adam Wolff wrote:
> Thanks for this clear response. A related question: given a view like this:
> map: function(doc){
>     emit(doc.refId,;
> },
> reduce :  function(keys, values, rereduce){
>     return values.join("");
> }

A reduce function of the form values.join("") is not good. At the bottom of
you can see:

"reduce functions should not grow its output larger than log(n) where n is
the number of input rows"


"the Reduce function has the requirement that not only must it be
referentially transparent, but it must also be commutative and associative
for the array value input"

The way I read that: for any ordering of input keys, the output value of
your reduce function must be the same.

Your function doesn't have this property, so you cannot use it.

Note that

   reduce(M0, M1, M2, M3)


   rereduce(reduce(M0, M1), reduce(M2, M3))


   rereduce(reduce(M2, M3), reduce(M0, M1))

must all evaluate to the same value. Sorting the values in your (re)reduce
function might help, except that you still wouldn't meet the log(n)
requirement.

Basically - this is not how reduce functions are intended to be used. They
must somehow summarise the data, not aggregate it.
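
As a rough illustration of the ordering problem, here is a standalone sketch
that can be run outside CouchDB. The names joinReduce, countReduce, M and
evaluate are made up for this example, and passing null for the keys argument
is a simplification; it only mimics the three evaluation orders listed above.

```javascript
// Hypothetical stand-ins for the two kinds of reduce function discussed above.
function joinReduce(keys, values, rereduce) {
  return values.join("");
}

function countReduce(keys, values, rereduce) {
  // On rereduce the values are partial counts; otherwise count the rows.
  return rereduce ? values.reduce((a, b) => a + b, 0) : values.length;
}

// Four made-up mapped values standing in for M0..M3.
const M = ["a", "b", "c", "d"];

// Evaluate a reduce function in the three ways listed in the message.
function evaluate(fn) {
  const left = fn(null, [M[0], M[1]], false);
  const right = fn(null, [M[2], M[3]], false);
  return [
    fn(null, M, false),           // reduce(M0, M1, M2, M3)
    fn(null, [left, right], true), // rereduce(reduce(M0, M1), reduce(M2, M3))
    fn(null, [right, left], true), // rereduce(reduce(M2, M3), reduce(M0, M1))
  ];
}

const joined = evaluate(joinReduce);   // ["abcd", "abcd", "cdab"]
const counted = evaluate(countReduce); // [4, 4, 4]
console.log(joined, counted);
```

The join-based function gives a different answer once the partial results
arrive in a different order, while the counting function does not.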

What you need, I believe, is a simple map:

  // MAP
  function(doc) {
    // Index each document under its refId; no value is needed.
    emit(doc.refId, null);
  }

Then the client can query giving a startkey and endkey, and get back a list
of documents which contain that reference or range of references.
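
A minimal sketch of how a client might build such a query URL. View key
parameters are JSON values, so they are JSON-encoded and then URL-escaped.
The helper viewUrl, the database "mydb", the design document "refs", the view
"by_refid" and the key values are all hypothetical, and the exact view path
depends on your CouchDB version.

```javascript
// Build a view query URL; every parameter value is JSON-encoded, then escaped.
function viewUrl(base, params) {
  const query = Object.entries(params)
    .map(([name, value]) => `${name}=${encodeURIComponent(JSON.stringify(value))}`)
    .join("&");
  return `${base}?${query}`;
}

// Hypothetical server, database, design document and view names.
const base = "http://localhost:5984/mydb/_design/refs/_view/by_refid";

// Ask for every row whose refId lies between the two (made-up) keys.
const url = viewUrl(base, { startkey: "ref-100", endkey: "ref-199" });
console.log(url);
```

For a single reference, startkey and endkey can simply be set to the same
value.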

You can add a reduce function like this:

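  function(ks, vs, co) {
    if (co) {
      // Rereduce: vs holds partial counts, so add them up.
      return sum(vs);
    } else {
      // First pass: vs has one entry per mapped row.
      return vs.length;
    }
  }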

Then querying with group=true will give you the refIds mapped to a count of
documents containing that refId. Querying with reduce=false will give you
the same as the map function by itself.
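
To make the group=true behaviour concrete, here is a rough emulation that
runs outside CouchDB. The rows, the groupReduce helper and the local sum()
stand-in are assumptions made for the example; the real server builds its
results from the view index rather than like this.

```javascript
// Stand-in for the sum() helper available inside CouchDB's JavaScript views.
function sum(values) {
  return values.reduce((total, v) => total + v, 0);
}

// The counting reduce function from the message.
function countReduce(ks, vs, co) {
  return co ? sum(vs) : vs.length;
}

// Made-up map output: one row per document, key = doc.refId, value = null.
const rows = [
  { id: "doc1", key: "A", value: null },
  { id: "doc2", key: "A", value: null },
  { id: "doc3", key: "B", value: null },
];

// Rough emulation of group=true: reduce the rows for each distinct key.
function groupReduce(mapRows, reduceFn) {
  const groups = new Map();
  for (const row of mapRows) {
    if (!groups.has(row.key)) groups.set(row.key, []);
    groups.get(row.key).push(row);
  }
  return [...groups].map(([key, members]) => ({
    key,
    value: reduceFn(
      members.map((r) => [r.key, r.id]), // keys arrive as [key, docid] pairs
      members.map((r) => r.value),
      false
    ),
  }));
}

const grouped = groupReduce(rows, countReduce);
console.log(grouped); // [ { key: 'A', value: 2 }, { key: 'B', value: 1 } ]
```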


