incubator-couchdb-user mailing list archives

From Mehdi El Fadil <>
Subject Re: Post-filtering reduced results?
Date Mon, 19 Sep 2011 14:42:15 GMT
Hi Calle,

At least you could move the post-processing to the server side using a list function.
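A list function along these lines would filter the reduced rows before they leave the server, so the client never sees the long tail. This is only a minimal sketch: the function name, threshold parameter, and row data are illustrative, not from the thread, and the harness at the bottom just simulates CouchDB's `getRow()`/`send()` environment so the snippet runs standalone.

```javascript
// Sketch of a CouchDB list function that filters reduced rows server-side.
// In CouchDB it would live in a design doc and be queried as
// /db/_design/ddoc/_list/top_keys/viewname?group=true&threshold=10
// (names and the threshold parameter are illustrative assumptions).
function topKeysList(head, req) {
  var threshold = parseInt(req.query.threshold || "10", 10);
  var first = true;
  send("[");
  var row;
  while ((row = getRow()) !== null) {
    if (row.value >= threshold) {
      if (!first) send(",");
      send(JSON.stringify({ key: row.key, count: row.value }));
      first = false;
    }
  }
  send("]");
}

// --- tiny harness standing in for CouchDB's list-function environment ---
var out = [];
function send(chunk) { out.push(chunk); }
var rows = [
  { key: "foo", value: 120 },
  { key: "bar", value: 3 },   // tail row: filtered out
  { key: "baz", value: 45 },
];
var i = 0;
function getRow() { return i < rows.length ? rows[i++] : null; }

topKeysList({}, { query: { threshold: "10" } });
var result = JSON.parse(out.join(""));
// result -> [{ key: "foo", count: 120 }, { key: "baz", count: 45 }]
```

Note this still makes CouchDB walk every reduced row; it only saves the network transfer and client-side JSON parsing, which is where Calle reported the 99.9% of runtime going.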

A better option for performance is to do the filtering inside the reduce
function itself. Have a look at this snippet; it looks close to what you are trying to do.
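One way to sketch the idea (the referenced snippet itself is not in the archive, so this is an assumption about the shape it took): have the reduce accumulate per-key counts in a single object and prune the least frequent keys whenever the object grows past a cap, then query with `?group=false` to get one trimmed object back. The `LIMIT` value and the pruning heuristic are illustrative, and the result is approximate for keys near the cutoff, since a key pruned in one partial reduction cannot reappear later.

```javascript
// Sketch of a map/reduce pair that trims the long tail inside reduce.
// LIMIT and the pruning rule are illustrative assumptions.
var designDoc = {
  map: function (doc) {
    emit(doc.key, 1); // assumes the field of interest is doc.key
  },
  reduce: function (keys, values, rereduce) {
    var LIMIT = 100; // cap on distinct keys kept per partial result
    var counts = {};
    function add(k, n) { counts[k] = (counts[k] || 0) + n; }
    if (rereduce) {
      // values are objects produced by earlier reduce calls
      values.forEach(function (partial) {
        for (var k in partial) add(k, partial[k]);
      });
    } else {
      // keys arrive as [key, docid] pairs in CouchDB's reduce API
      keys.forEach(function (pair) { add(pair[0], 1); });
    }
    // Prune the long tail so the reduce value stays small (lossy!)
    var ks = Object.keys(counts);
    if (ks.length > LIMIT) {
      ks.sort(function (a, b) { return counts[b] - counts[a]; });
      ks.slice(LIMIT).forEach(function (k) { delete counts[k]; });
    }
    return counts;
  },
};

// Local simulation of one reduce pass followed by a rereduce:
var r1 = designDoc.reduce([["a", "d1"], ["a", "d2"], ["b", "d3"]], [1, 1, 1], false);
var r2 = designDoc.reduce([["a", "d4"], ["c", "d5"]], [1, 1], false);
var total = designDoc.reduce(null, [r1, r2], true);
// total -> { a: 3, b: 1, c: 1 }
```

Keeping the object bounded by `LIMIT` also matters because CouchDB's query server complains about reduce outputs that grow with the input; an unbounded per-key object would trip that check on a view with tens of thousands of distinct keys.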

Good luck,


On Mon, Sep 19, 2011 at 1:27 PM, Calle Dybedahl <>wrote:

> Hello.
> I have a pretty simple pair of map and reduce functions. The first is
> basically just emitting a key and a 1, and the reduce is the built-in _sum
> function. This works fine, and tells me how many times every key has been
> seen.
> Now, the problem is that I'm actually only interested in the handful of
> keys that have been seen the most often. The data fits a power-law
> distribution, which means that there is a long tail that I'm not at all
> interested in. And by "long" here I'm talking about tens of thousands of
> rows. At the moment, my client-side code spends more than 99.9% of its
> runtime receiving and parsing JSON from the CouchDB server, very nearly all
> of which it will promptly throw away as soon as it's been parsed. This is
> annoying and silly.
> Is there any way at all to filter the results of a reduced query on the
> CouchDB end? Alternatively, is there a way for a reduce function to know
> that it's the final stage in the re-reduce chain (if I could drop all keys
> with a final value of 1, I'd save an order of magnitude of runtime)?
> I can't be the first one ever to run into a problem like this, but I've
> failed to find any solutions on the net.
> --
> Calle Dybedahl
> -*- +46 703 - 970 612

Mehdi El Fadil
twitter: @mango_info
