incubator-couchdb-user mailing list archives

From James Marca <>
Subject Re: how do I do different reduce operations on the same map
Date Thu, 12 Feb 2009 01:32:25 GMT
On Wed, Feb 11, 2009 at 08:08:39PM +0000, Brian Candler wrote:
> On Tue, Feb 10, 2009 at 02:31:58PM -0800, James Marca wrote:
> > I have a situation where I want to run two different reduce functions
> > on the output of a single map function.  Like suppose I want one
> > reduce function to get the count of all objects in each group (for
> > example, documents with or without attachments), and another reduce to
> > compute some other aggregate, like the average and standard deviation
> > of a value, (like the average size of attached documents).  (Yes, I
> > know this is a stupid example, as the averaging reduce function will
> > also have the count, but my real case is too complicated to write
> > easily).
> I believe reduce values are any JSON object, so perhaps you could reduce to
> an array of values, e.g. [count, total, sum_of_squares]
> The final calculation of average and SD could then be left to the client

I'll have to think about what that means.  I've already got mean, SD,
etc. handled in a reduce, but I was wondering about doing other things
with the same map.  I am analyzing a year of data from detectors, with
most of the detectors reporting every 30 seconds.  So I want to say
things like "the average, std. dev., min and max for X on Tuesday
between 8:05 and 8:10 were [...]"  That's one map/reduce run.  But
there might be other things we want to look at, so I was wondering
whether it was worth optimizing a single map now (given the size of
the data) rather than adding more maps later.
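
For reference, here is a rough sketch of what Brian's suggestion might
look like: a single reduce that carries count, sum, sum of squares, min
and max together, so that mean and SD can be derived on the client.
The names combinedReduce and finalize are hypothetical, and the sketch
assumes each value emitted by the map is a plain number.  It returns an
object rather than the array Brian mentioned, purely for readability.
In a design document the reduce would be an anonymous function with
the same (keys, values, rereduce) arguments.

```javascript
// Hypothetical combined reduce: one pass yields several aggregates.
// Assumes every mapped value is a plain number (e.g. one detector reading).
function combinedReduce(keys, values, rereduce) {
  var out = null;
  for (var i = 0; i < values.length; i++) {
    var v = values[i];
    // First pass: v is a raw number. Rereduce: v is a partial result
    // that already has the output shape.
    var part = rereduce
      ? v
      : { count: 1, sum: v, sumsq: v * v, min: v, max: v };
    if (out === null) {
      out = {
        count: part.count, sum: part.sum, sumsq: part.sumsq,
        min: part.min, max: part.max
      };
    } else {
      out.count += part.count;
      out.sum += part.sum;
      out.sumsq += part.sumsq;
      out.min = Math.min(out.min, part.min);
      out.max = Math.max(out.max, part.max);
    }
  }
  return out;
}

// Client-side step (hypothetical helper): derive the mean and the
// population standard deviation from the reduced row.
function finalize(r) {
  var mean = r.sum / r.count;
  var variance = r.sumsq / r.count - mean * mean;
  return {
    count: r.count,
    mean: mean,
    sd: Math.sqrt(Math.max(variance, 0)), // clamp tiny negative rounding error
    min: r.min,
    max: r.max
  };
}
```

Because the partial results are just sums, counts and extremes, they
combine correctly on rereduce.  Any further statistic that can be
derived from these five numbers would not need another map or reduce,
only a change to the client-side step.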

