incubator-couchdb-user mailing list archives

From Glenn Rempe <gl...@rempe.us>
Subject Re: Help with complex key range query and map/reduce
Date Mon, 28 Sep 2009 20:51:09 GMT
Thank you Jeremy, Jesse, Adam, and Chris for the responses.  They are very
much appreciated.
Chris, is your concern specifically with what Adam was proposing?  Would
Jesse's solution be more performant over time and more in line with how
CouchDB wants to operate?  Or do both of these pose a problem?

I currently have about 26 million records in a DB I am converting over to
CouchDB, and the reduce could potentially be performed on hundreds of
thousands of records for a single query if the user wanted to generate a
report across, say, 90 days of data.

Jesse et al., regarding your warning about how best to search across the
complex key: could you take a peek at a code snippet and let me know if I
am going astray in my in-development code?  Here is a link to a snippet
(Ruby code):

http://pastie.org/633956

So here is what I am trying to do:

The user wants to filter a search based on category_id, date range, and
engine.

- The category_id is always the same for the start and end key.
- The dates in the start and end key would almost always be a range
(specific start/end dates, or the last X days counting back from today)
- They should be able to filter on engine (specifically g, y, or b, or
across all of them)

So are you saying that if I query like:

startkey = [1, 2009, 9, 1, "a"]
endkey = [1, 2009, 9, 28, "a\u9999"]

Would it not return all records between 9/1 and 9/28, across ALL engines?

Alternatively if I was specific with the engine, but still provided a date
range:

startkey = [1, 2009, 9, 1, "g"]
endkey = [1, 2009, 9, 28, "g"]

Would it not return category 1, across the entire date range, but limited to
those with the engine 'g' only?
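
A minimal sketch of how the two ranges above might behave, assuming view
keys are compared as arrays, element by element from left to right. The
compareKeys function is a simplified stand-in (plain string comparison
rather than the ICU collation CouchDB uses), and the sample keys are
hypothetical, not taken from the real data set.

```javascript
// Simplified stand-in for view key collation: arrays are compared
// element by element; the first differing element decides the order.
function compareKeys(a, b) {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return a.length - b.length;
}

// A key is returned when startkey <= key <= endkey.
function inRange(key, startkey, endkey) {
  return compareKeys(key, startkey) >= 0 && compareKeys(key, endkey) <= 0;
}

// Hypothetical emitted keys: [category_id, year, month, day, engine]
const keys = [
  [1, 2009, 9, 1, "b"],
  [1, 2009, 9, 1, "g"],
  [1, 2009, 9, 15, "b"],
  [1, 2009, 9, 15, "y"],
  [1, 2009, 9, 28, "g"],
  [1, 2009, 9, 28, "y"],
];

// First query: intended to match every engine from 9/1 to 9/28.
const allEngines = keys.filter(function (k) {
  return inRange(k, [1, 2009, 9, 1, "a"], [1, 2009, 9, 28, "a\u9999"]);
});

// Second query: intended to match engine "g" only.
const onlyG = keys.filter(function (k) {
  return inRange(k, [1, 2009, 9, 1, "g"], [1, 2009, 9, 28, "g"]);
});

console.log(allEngines);
console.log(onlyG);
```

Under these assumptions the engine element only matters on the boundary
days. The first range drops the 9/28 rows, because "g" and "y" sort after
"a\u9999", and the second range still includes the "b" and "y" rows for
9/15, because the day element decides the comparison before the engine is
ever looked at.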

Again, thanks very much for all your help with getting over the humps.

Glenn



On Mon, Sep 28, 2009 at 11:07 AM, Chris Anderson <jchris@apache.org> wrote:

> On Mon, Sep 28, 2009 at 10:31 AM, Adam Wolff <awolff@gmail.com> wrote:
> > If you change your map like this:
> >       var value = {};
> >       value[domain] = 1;
> >       emit([key...], value);
> >
> > Then you don't need conditional handling of rereduce. You can just write:
> >           values.forEach(function(v) {
> >               for (var k in v) {
> >                   hist[k] = (hist[k] || 0) + v[k];
> >               }
> >           });
> >
>
> Just a warning that keeping a growing dictionary in your reduce values
> can be considered an anti-pattern. You'll get reduce overflows on big
> data sets.
>
>
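
A rough sketch of the pattern quoted above and of why the reduce value
grows, run outside CouchDB as plain JavaScript. The mapValue helper, the
reduce signature used here, and the example domains are assumptions for
illustration only.

```javascript
// Map-side value as suggested in the quote: a one-entry dictionary.
function mapValue(domain) {
  const value = {};
  value[domain] = 1;
  return value;
}

// The same merge works for reduce and rereduce, because the inputs
// and the output share one shape (domain -> count).
function reduce(keys, values, rereduce) {
  const hist = {};
  values.forEach(function (v) {
    for (const k in v) {
      hist[k] = (hist[k] || 0) + v[k];
    }
  });
  return hist;
}

// Two partial reductions over hypothetical rows, then a rereduce.
const first = reduce(null, [
  mapValue("example.com"),
  mapValue("example.org"),
  mapValue("example.com"),
], false);
const second = reduce(null, [mapValue("example.net")], false);
const total = reduce(null, [first, second], true);

console.log(total);
console.log(Object.keys(total).length);
```

The merged value holds one entry per distinct domain, so its size tracks
the number of distinct domains in the range rather than staying constant,
which is the growth the warning above is about.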
