incubator-couchdb-user mailing list archives

From "Chris Van Pelt" <>
Subject Re: Slooooow views
Date Thu, 08 Jan 2009 00:31:30 GMT
I chose couch because I needed a way to take arbitrary hashes and
combine them, performing various operations on dynamic key/value
pairs.  Seeing that couch would eventually be able to do this in a
distributed manner seemed like a great fit.

My impression was that the reduce step was incremental once the
functions were defined...  Given the referential transparency of my
reduce function, I don't understand the performance impact incurred by
its large dynamic hash output.  Can you think
of a better fit for my needs in another solution?
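The pattern described above, a reduce that merges hashes into one growing
map, can be sketched roughly like this (a hypothetical sketch only; the
field names and the numeric merging are assumptions, the actual function
in the gist aggregates arbitrary judgment fields):

```javascript
// Hypothetical sketch of the hash-merging reduce pattern discussed in
// this thread. Each input value is a map of field -> number, and the
// reduction merges them into one combined map. Note the problem: the
// returned value grows with the number of distinct fields instead of
// staying small and fixed-size.
function reduce(keys, values, rereduce) {
  var combined = {};
  values.forEach(function (v) {
    Object.keys(v).forEach(function (field) {
      combined[field] = (combined[field] || 0) + v[field];
    });
  });
  return combined; // size is unbounded -- grows with distinct fields
}
```

Because partial reductions are themselves maps, the same merging logic
happens to work for both the reduce and rereduce phases, which is why the
sketch ignores the rereduce flag.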


On Wed, Jan 7, 2009 at 4:00 PM, Damien Katz <> wrote:
> In CouchDB, your reductions must compute to smallish, fixed-size data. The
> problem is your reduce function: it builds up and returns a map of values,
> and as CouchDB computes the index, it will actually compute the reduction of
> every value in the view. Every time the index is updated, it does this.
> -Damien
> On Jan 7, 2009, at 6:38 PM, Chris Van Pelt wrote:
>> Ok, so I created a gist with the map, reduce, and a document:
>> The purpose of this view is to combine multiple judgments (the data
>> attribute of the doc) for a single unit_id.  The "fields"
>> attribute tells couch how to aggregate the data (averaging numbers,
>> choosing the most common item, etc.).
>> I do use group=true, along with skip and count when querying this
>> view.  I understand that skip can slow things down, but the request is
>> still slow when skip is 0.
>> Another strange thing is that even when I query one of my "count"
>> views (a simple sum() reduce step) I experience the same lag.  Could
>> this be because my count views are a part of the same design document?
>> Also are there better ways to debug this?  I've set my log level to
>> debug, but it doesn't give me details about where the processing
>> time is going, and I can only gauge response times to the
>> second...
>> Chris
>> On Wed, Jan 7, 2009 at 3:12 PM, Chris Anderson <> wrote:
>>> On Wed, Jan 7, 2009 at 3:07 PM, Jeremy Wall <> wrote:
>>>> Maybe someone else could chime in on when you get the hit for reduction?
>>> Based on my use of log() in the reduce function, it looks like for
>>> each reduce query, the reduce function is run once to obtain the
>>> final reduce value.
>>> When you run a group=true, or group_level reduce query, which returns
>>> values for many keys, you'll end up running the final reduction once
>>> per returned value. I think this could be optimized to avoid running
>>> final reduces if they've already been run for those key-ranges. I'm
>>> not sure how much work that would be.
>>> --
>>> Chris Anderson
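Damien's advice above, that reductions must compute to smallish,
fixed-size data, can be illustrated with a count/sum reduce whose result
stays the same size no matter how many rows it covers (a hypothetical
sketch, not taken from the gist under discussion):

```javascript
// A reduce whose output is small and fixed-size: a {count, sum} pair.
// In the rereduce phase the inputs are previous reduction values
// (already {count, sum} pairs), not raw mapped numbers, so the two
// phases must be handled separately.
function reduce(keys, values, rereduce) {
  if (rereduce) {
    return values.reduce(function (acc, v) {
      return { count: acc.count + v.count, sum: acc.sum + v.sum };
    });
  }
  return {
    count: values.length,
    sum: values.reduce(function (a, b) { return a + b; }, 0)
  };
}
```

Because the result never grows, CouchDB can cache intermediate
reductions in the view's B-tree and combine them incrementally, which is
the behavior Chris expected from his hash-merging reduce.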
