couchdb-user mailing list archives

From "Chris Anderson" <jch...@apache.org>
Subject Re: indexes and a waste of a good map reduce
Date Fri, 31 Oct 2008 15:07:07 GMT
On Thu, Oct 30, 2008 at 6:35 PM, Ben Nevile <ben@mainsocial.com> wrote:
>
> So let's say we've implemented a word count.  We have a nice view that is
> indexed by word, so we can query any word and find out how many times it
> appears.  But if we want to know what words are the most frequent, seems to
> me that currently we're out of luck.

Ben,

You're right that sorting on reduce values is not part of the current
feature set. Here is how I've done that work in the past:

First, define a map-only view whose keys group together the rows you
want the reduce to be performed on. Then use the key_reduce function
from this code (or write your own):
http://github.com/jchris/couchrest/tree/master/lib/couchrest/helper/pager.rb

The idea is that this code pages through the view, yielding each key
and all of the values that are associated with it. You could do
whatever you like with this data. I define a "reduce" function in
Ruby, and save its output as documents in another database. E.g., if
your data is in my-db, then key_reduce into my-db-reduce.
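
A minimal sketch of the grouping step in plain Ruby, assuming the view
rows arrive sorted by key as hashes with "key" and "value" entries. The
name each_key_group and the sample rows are hypothetical; this is not
the key_reduce from pager.rb, and it works on an in-memory array rather
than paging through a view over HTTP.

```ruby
# Hypothetical helper: walks rows that are already sorted by key and
# yields each distinct key together with all of its values.
def each_key_group(rows)
  return enum_for(:each_key_group, rows) unless block_given?

  rows.chunk_while { |a, b| a["key"] == b["key"] }.each do |group|
    yield group.first["key"], group.map { |row| row["value"] }
  end
end

# Sample rows, shaped like the output of a map-only word-count view.
rows = [
  { "key" => "apple",  "value" => 1 },
  { "key" => "apple",  "value" => 1 },
  { "key" => "banana", "value" => 1 },
  { "key" => "cherry", "value" => 1 },
  { "key" => "cherry", "value" => 1 },
  { "key" => "cherry", "value" => 1 }
]

# The "reduce" is ordinary Ruby. Each result is shaped like a document
# that could then be saved into a second database such as my-db-reduce.
reduced_docs = each_key_group(rows).map do |word, values|
  { "word" => word, "count" => values.sum }
end
```

Saving reduced_docs to the second database is left out here, since it
depends on whichever client library you use.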

Then you can define another set of map (and/or reduce) views on
my-db-reduce, which will sort the keys by a reduce value.
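
To illustrate why the second database helps, here is a rough Ruby
simulation of what a map view over my-db-reduce would give you. The
document shape ("word" and "count" fields) and the sample data are
assumptions for illustration; in CouchDB itself the view would do the
key ordering for you.

```ruby
# Assumed shape of the documents saved in my-db-reduce.
reduced_docs = [
  { "word" => "apple",  "count" => 2 },
  { "word" => "banana", "count" => 1 },
  { "word" => "cherry", "count" => 3 }
]

# Stand-in for a map view that emits the count as the key and the word
# as the value. View rows come back ordered by key, simulated here
# with sort_by.
rows_by_count = reduced_docs
  .map { |doc| { "key" => doc["count"], "value" => doc["word"] } }
  .sort_by { |row| row["key"] }

# Stand-in for reading the view in descending order and keeping only
# the first two rows: the two most frequent words.
top_words = rows_by_count.reverse.first(2).map { |row| row["value"] }
```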

There are some missing features here. Chiefly, this code is not well
documented, and I'm certain parts of it could use more convention
and less ad-hoc decision making on the part of the user. But the
really big feature here would be incremental reduce. It's just a
matter of bookkeeping really, but it's not yet implemented. Perhaps
next time I have a novel use for key_reduce I'll get it working
incrementally.

Hope this helps.


-- 
Chris Anderson
http://jchris.mfdz.com
