couchdb-user mailing list archives

From Wout Mertens <wmert...@cisco.com>
Subject Re: [user] Obtaining unique values from a view
Date Thu, 05 Mar 2009 15:59:18 GMT
On Mar 4, 2009, at 3:28 AM, Chris Anderson wrote:

> On Tue, Mar 3, 2009 at 1:32 PM, Wout Mertens <wmertens@cisco.com> wrote:
>> Would the problem be alleviated if you could specify for views that couch
>> should not reduce past the group level? In other words, only calculate
>> what's needed for views with group=true?
>>
>
> Sort of. Essentially this would require an entirely different
> map/reduce implementation. It would probably only provide reductions
> at the group level (like Hadoop reduce). CouchDB is open to /
> interested in alternate view engines, and something like this could
> probably be created in a not-too-overwhelming amount of Erlang, on top
> of CouchDB's btree storage engine. Patches welcome! (Also, there are
> some patches floating around - once 0.9.0 is off our plate we'll
> probably have more spare cycles available for evaluating/consolidating
> them.)

Actually, I just did some tests around this, and it turns out that if  
you always query with group=true, CouchDB never runs the final reduce!

I tested it by making a temporary view in Futon:

map:    function(doc) { emit(doc._rev % 10, doc._id); }
reduce: function(k, v, r) {
          if (r) { log(["rereduce", v]); }
          else   { log(["reduce", v]); }
          return v;
        }

Just interacting with the view in Futon doesn't run rereduce on all
view keys. Once you access the view directly, without the group=true
parameter, CouchDB calculates the (re)reduce. I hadn't realized this
before, but for small databases it never calls rereduce at all, which
makes sense now.
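
To illustrate why a small database might never see a rereduce call, here is a toy model in plain JavaScript. It is only a sketch under assumptions: simulateReduce, sum and the fixed chunkSize are made up for illustration, and CouchDB's real chunking follows its B-tree layout rather than a fixed size. The idea is just that reduce runs once per chunk of rows, and rereduce is only needed when more than one partial result has to be combined.

```javascript
// Toy model only: the names and the fixed chunk size are hypothetical,
// not CouchDB internals.
function simulateReduce(values, reduceFn, chunkSize) {
  const calls = [];    // records which phase ran, in order
  const partials = []; // one partial result per chunk

  // First pass: reduce each chunk of raw values (rereduce = false).
  for (let i = 0; i < values.length; i += chunkSize) {
    calls.push("reduce");
    partials.push(reduceFn(null, values.slice(i, i + chunkSize), false));
  }

  // Second pass: only needed when there are several partials to combine.
  let result = partials.length ? partials[0] : null;
  if (partials.length > 1) {
    calls.push("rereduce");
    result = reduceFn(null, partials, true);
  }
  return { result, calls };
}

// A reduce that behaves the same in both phases: summing numbers.
function sum(keys, values, rereduce) {
  return values.reduce((a, b) => a + b, 0);
}

// Few rows: everything fits in one chunk, so rereduce never runs.
const small = simulateReduce([1, 2, 3], sum, 4);
// More rows than one chunk holds: partial results must be combined.
const large = simulateReduce([1, 2, 3, 4, 5, 6], sum, 4);

console.log(small.calls, small.result);
console.log(large.calls, large.result);
```

In this model the three-row case logs a single reduce call, while the six-row case logs two reduce calls followed by one rereduce.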

So as long as you promise never to run a particular "wide view"  
without the group=true parameter, and the "wideness" of your view  
results is manageable, it looks like you should be ok.
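
One way to keep that promise on the client side is to build every view URL through a small helper that forces group=true. This is a sketch, not part of CouchDB: groupedViewUrl is a hypothetical name, and the host, database and view path in the usage line are placeholders (the view path layout also differs between CouchDB versions, so pass in whatever your server uses).

```javascript
// Hypothetical helper: returns the view URL with group=true always set,
// overriding any "group" value the caller passed in.
function groupedViewUrl(viewUrl, params = {}) {
  const url = new URL(viewUrl);
  for (const [name, value] of Object.entries(params)) {
    url.searchParams.set(name, String(value));
  }
  // Set last so that a caller cannot switch grouping off by accident.
  url.searchParams.set("group", "true");
  return url.toString();
}

// Placeholder URL; substitute your own server, database and view path.
const example = groupedViewUrl(
  "http://localhost:5984/mydb/_design/example/_view/by_rev_digit",
  { limit: 10, group: "false" }
);
console.log(example);
```

This only guards your own code paths. It does nothing to stop someone else from requesting the view without group=true.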

Of course, some attacker could DoS your server by calling the view  
without group=true :-/

Let's say I'm 70% certain of the above being true. I think I'm still  
missing some subtleties in map/reduce. Any opinions?

Anyway, CouchDB rocks :-)

Wout.
