incubator-couchdb-user mailing list archives

From Jesse Hallett <>
Subject Re: Help with complex key range query and map/reduce
Date Tue, 29 Sep 2009 16:17:42 GMT

I am afraid that to do what you want you will need at least two
different views.  As Ning explained, keys are compared position by position
from the left, so the most significant position of the key determines what
will be filtered.  For example:

    start: [1, 2009, 9, 1, "g"]
    end:  [1, 2009, 9, 29, "g"]

This range will return all results for September 1 through 29 with cat_id 1,
regardless of engine, except that it will exclude items from 9/1 with
engines "a"-"f" and items from 9/29 with engines "h"-"z".  To demonstrate
this, consider these ordering facts:

    [1, 2009, 9, 5, "r"] > [1, 2009, 9, 1, "g"]  // 5 > 1
    [1, 2009, 9, 23, "m"] < [1, 2009, 9, 29, "g"]  // 23 < 29
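
The ordering facts above can be checked with a small sketch of left-to-right
key comparison.  This is a simplification and not CouchDB's actual collation
code: it only handles numbers and strings, and it compares strings with plain
JavaScript ordering rather than CouchDB's Unicode collation.  The names
compareKeys and inRange are made up for the illustration.

```javascript
// Simplified stand-in for CouchDB array-key collation (assumption:
// elements are only numbers or strings, and numbers sort before strings).
function compareKeys(a, b) {
  var n = Math.min(a.length, b.length);
  for (var i = 0; i < n; i++) {
    var x = a[i], y = b[i];
    if (typeof x !== typeof y) {
      return typeof x === "number" ? -1 : 1;
    }
    if (x < y) return -1;
    if (x > y) return 1;
    // Equal at this position: move on to the next, less significant one.
  }
  // One key is a prefix of the other: the shorter key sorts first.
  return a.length - b.length;
}

// A row is returned when start <= key <= end (both bounds inclusive).
function inRange(key, start, end) {
  return compareKeys(start, key) <= 0 && compareKeys(key, end) <= 0;
}

var start = [1, 2009, 9, 1, "g"];
var end = [1, 2009, 9, 29, "g"];

var midMonthOtherEngine = inRange([1, 2009, 9, 5, "r"], start, end);   // included
var firstDayLowEngine = inRange([1, 2009, 9, 1, "a"], start, end);     // excluded
var lastDayHighEngine = inRange([1, 2009, 9, 29, "z"], start, end);    // excluded
```

The engine position only matters when every position before it ties with the
bound, which happens only on the first and last day of the range.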

That is why it is necessary to move the engine parameter in front of the
date when you filter by engine.  If you want the options of filtering by a
specific engine or including all engines in a given date range, you will
need two different views whose keys are ordered differently.  If you want a
specific range of engines, you will have to lock that query to a single date
value, because the engine range only applies cleanly when every key position
before it is fixed.  You can, for example, create a view that emits only the
year and month of the date, and then query a specific range of engines over
one month.
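
A rough sketch of the two views might look like the following.  The document
fields (cat_id, engine, year, month, day) and the function names are
assumptions for the illustration, since the actual document layout is not
shown in this thread.  The emit() defined here is only a stand-in so the
sketch runs on its own; inside CouchDB the view server supplies emit() and
each map function is stored as an anonymous function(doc) in a design document.

```javascript
// Stand-in for the emit() that CouchDB's view server provides.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// View A: date ahead of engine.
// Suited to "all engines within a date range for one cat_id".
function mapByDate(doc) {
  emit([doc.cat_id, doc.year, doc.month, doc.day, doc.engine], 1);
}

// View B: engine ahead of date.
// Suited to "one specific engine within a date range for one cat_id".
function mapByEngine(doc) {
  emit([doc.cat_id, doc.engine, doc.year, doc.month, doc.day], 1);
}

// Hypothetical document, used only to show the resulting keys.
var doc = { cat_id: 1, engine: "g", year: 2009, month: 9, day: 5 };
mapByDate(doc);
mapByEngine(doc);
```

With view B, a query for engine "g" in September would use a start key of
[1, "g", 2009, 9, 1] and an end key of [1, "g", 2009, 9, 29].  With view A,
ending the range at [1, 2009, 9, 29, {}] includes every engine on the last
day, since objects collate after strings.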

As for the increasing size of the reduce function output, I think whether
this is a problem depends on the ratio of domains to documents that you are
querying across.  As a rule of thumb, the volume of data output by the
reduce function should grow no faster than the log of the volume of data
emitted by the map function for the same key range.  So if you have a small
number of domains compared to documents, you are likely to be OK.

On Sep 28, 2009 3:08 PM, "Adam Wolff" <> wrote:

On Mon, Sep 28, 2009 at 1:51 PM, Glenn Rempe <> wrote:
> Thank you Jeremy, Jesse, Ada...
The problem is with the use-case, not the data representation (so "both.")
More about reduce limitations here:

One workaround here is to limit the number of values that are emitted in a
given reduce step to, say, the top n. This causes data-loss problems, but is
adequate for some applications.
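
The top-n workaround could be sketched roughly as below.  It assumes each
map value is an object of the hypothetical shape { domain: count }, so that
reduce and rereduce inputs look the same; TOP_N and reduceTopN are
illustrative names, and the real values in this thread may be shaped
differently.

```javascript
// Hypothetical cap on how many domains a reduce step may return.
var TOP_N = 3;

// Same signature as a CouchDB reduce function: (keys, values, rereduce).
// Because map values and reduce outputs share one shape here, the
// rereduce flag does not change the logic.
function reduceTopN(keys, values, rereduce) {
  var totals = {};
  var domain;

  // Merge the per-domain counts from every incoming value.
  for (var i = 0; i < values.length; i++) {
    for (domain in values[i]) {
      if (Object.prototype.hasOwnProperty.call(values[i], domain)) {
        totals[domain] = (totals[domain] || 0) + values[i][domain];
      }
    }
  }

  // Order domains by count, highest first; ties fall back to the name.
  var domains = [];
  for (domain in totals) {
    if (Object.prototype.hasOwnProperty.call(totals, domain)) {
      domains.push(domain);
    }
  }
  domains.sort(function (a, b) {
    return totals[b] - totals[a] || (a < b ? -1 : 1);
  });

  // Keep only the top TOP_N so the output size stays bounded.
  var out = {};
  for (var j = 0; j < domains.length && j < TOP_N; j++) {
    out[domains[j]] = totals[domains[j]];
  }
  return out;
}

var sample = reduceTopN(null, [{ a: 1 }, { b: 2 }, { a: 2 }, { c: 1 }, { d: 5 }], false);
```

The data loss shows up when a domain is trimmed in one reduce step and
appears again in another: its earlier counts are gone, so the final totals
can be understated or the ranking can be wrong.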

