couchdb-user mailing list archives

From Brian Candler <B.Cand...@pobox.com>
Subject Re: Problems with reduce in view appear when record size > 6
Date Thu, 30 Jul 2009 07:43:03 GMT
On Wed, Jul 29, 2009 at 11:48:59PM -0400, Jochen Kempf wrote:
>    guessing that you refer to this page [1]incremental map

No, I meant this one.
http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views

"Reduce functions must accept, as input, results emitted by its
corresponding map function *as well as results returned by the reduce
function itself*. The latter case is referred to as a rereduce"

It then goes on to describe the two cases.

>    map =>
>    "
>      function(doc) {
>      emit(doc["_id"], [doc["_id"], doc["_rev"], doc["var1"], doc["var2"],
>    doc["var3"], doc["var4"], doc["var5"]]);
>      }
>    "
>    reduce =>
>    "
>      function(key, values, combine) {
>            var result = {ids:[], revs:[], variables:[]}
>              if (combine) {
>                for (i in values) {
>                  result.ids.push(values[i].ids);
>                  result.revs.push(values[i].revs);
>                  result.variables.push(values[i].variables);
>                }
>              } else {
>                for (i in values) {
>                  result.ids.push(values[i][0]);
>                  result.revs.push(values[i][1]);
>                  result.variables.push([values[i][2], values[i][3],
>    values[i][4], values[i][5], values[i][6]]);
>                }
>              }
>            return result;
>          }
>    "

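I think you want concat() rather than push() in the combine section: push()
appends each partial result's array as a single nested element, whereas
concat() merges its elements. Note that concat() returns a new array, so the
result has to be assigned back.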
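
A minimal sketch of the quoted reduce function with that one change applied. The function is given the name reduce here only so that it can be called on its own (in a design document it would be an anonymous function), and the loops use plain indices instead of for...in; neither change is part of the original code.

```javascript
// Sketch of the quoted reduce function with concat() in the rereduce branch.
// The third argument is called "combine" as in the quoted code; CouchDB
// passes true there when the inputs are earlier reduce results.
function reduce(key, values, combine) {
  var result = {ids: [], revs: [], variables: []};
  var i;
  if (combine) {
    // Each values[i] is an earlier result object. concat() merges its
    // arrays into ours instead of nesting them one level deeper.
    for (i = 0; i < values.length; i++) {
      result.ids = result.ids.concat(values[i].ids);
      result.revs = result.revs.concat(values[i].revs);
      result.variables = result.variables.concat(values[i].variables);
    }
  } else {
    // Each values[i] is the array emitted by the map function:
    // [_id, _rev, var1, var2, var3, var4, var5]
    for (i = 0; i < values.length; i++) {
      result.ids.push(values[i][0]);
      result.revs.push(values[i][1]);
      result.variables.push(values[i].slice(2, 7));
    }
  }
  return result;
}
```

With this version the rereduce output has the same flat shape as the first-pass output, so it can be fed back into the function any number of times.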

Otherwise, that looks like a working but extremely bad reduce function. Once
your database goes above a certain size it will trigger a limit error in
CouchDB; you can disable that error, but then you will suffer very poor
performance as your database gets bigger.

The problem is that your reduce function doesn't "reduce" the size of its
output at all; the size of the reduce value will increase linearly with the
size of the database. In each Btree node, CouchDB stores the reduce value
computed across all the documents in that node and its children. This means
the root Btree node stores the reduce value for the entire database.

This is very good for calculating reduce values quickly, but very bad if
your reduce value becomes huge, as yours will, because it will become slower
and slower to insert documents.

See "Reduced Value Sizes" in the Wiki page linked to above.
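
To make the size argument concrete, here is a rough simulation, not CouchDB code. The helper names (collectIds, countRows, rootValueSize), the "doc0", "doc1", ... ids and the batch size are all made up for illustration; the only thing borrowed from CouchDB is the (keys, values, rereduce) calling convention.

```javascript
// Hypothetical helpers for illustration only; none of this is CouchDB API.

// A reduce that collects every id, in the spirit of the one quoted above.
// concat() appends a plain string (first pass) and merges an array
// (rereduce), so one loop covers both cases.
function collectIds(keys, values, rereduce) {
  var out = [];
  for (var i = 0; i < values.length; i++) {
    out = out.concat(values[i]);
  }
  return out;
}

// A reduce that counts rows: the result is always a single number.
function countRows(keys, values, rereduce) {
  if (!rereduce) {
    return values.length;
  }
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Length of the JSON for the value that would end up at the root after
// reducing n rows in batches of batchSize and rereducing the partials.
function rootValueSize(reduceFn, n, batchSize) {
  var partials = [];
  for (var start = 0; start < n; start += batchSize) {
    var batch = [];
    for (var j = start; j < Math.min(start + batchSize, n); j++) {
      batch.push("doc" + j);
    }
    partials.push(reduceFn(null, batch, false));
  }
  return JSON.stringify(reduceFn(null, partials, true)).length;
}
```

Comparing rootValueSize(collectIds, n, 50) with rootValueSize(countRows, n, 50) for growing n shows the first growing roughly in proportion to n while the second stays a few characters long.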

Basically this means you're doing it wrong. This sort of computation should
be done in the client, not the database. If you really want to do it in the
database, do it in a _list function. (This will still end up fetching and
serializing all the documents in the database or the key range in question,
but at least it won't send them over the wire.)
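
For the client-side route, a minimal sketch could look like the following. It assumes the view is queried without the reduce step and that the client has already parsed the JSON response, whose "rows" array holds objects of the form {id, key, value}, with value being the array emitted by the quoted map function. The name collectRows is made up for this example.

```javascript
// Client-side sketch: build the same {ids, revs, variables} structure from
// plain view rows. "rows" is assumed to be the "rows" array of a parsed
// view response; collectRows is a hypothetical helper name.
function collectRows(rows) {
  var result = {ids: [], revs: [], variables: []};
  rows.forEach(function (row) {
    // row.value is [_id, _rev, var1, var2, var3, var4, var5]
    var v = row.value;
    result.ids.push(v[0]);
    result.revs.push(v[1]);
    result.variables.push(v.slice(2, 7));
  });
  return result;
}
```

Nothing is stored in the view index beyond the map rows, so the cost of building the big structure is paid only by the client that asks for it.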

Regards,

Brian.
