incubator-couchdb-user mailing list archives

From "Kropp, Henning" <>
Subject Re: Multiple map reduce stages
Date Thu, 20 May 2010 09:01:58 GMT
On 20.05.2010 10:48, Kropp, Henning wrote:
> On 18.05.2010 20:16, J Chris Anderson wrote:
>> On May 18, 2010, at 2:52 AM, Kropp, Henning wrote:
>>> Hi,
>>> as far as I know, working with map reduce commonly involves multiple map
>>> and reduce stages. A view in CouchDB consists solely of one map and, if
>>> necessary, one reduce stage!? To have multiple map and reduce stages one
>>> would have to chain views in CouchDB!? How can I do that? Is it
>>> possible to give the function(doc){..} another parameter? There are the
>>> show functions, which have the extra parameter req for the HTTP request.
>>> Unfortunately my knowledge of JavaScript's underlying prototype
>>> concept, which could be helpful here, is not very solid.
>>> Kind regards and many thanks in advance
>> CouchDB Map Reduce is a realtime incremental model, so it is quite different from
>> the Hadoop-style batch model. Of course you can still chain map reduce by copying the rows
>> from a view query to a new db, and writing another view on the new db.
>> Chris
> That is interesting to know. Hive adopts the batch model but obviously
> serves a different purpose.
> I was asking because of an actual problem I am having; maybe someone can
> help. I would like to group documents by
> a value, but only those documents in a certain time interval. In this
> scenario CouchDB is used for logging, which might not be a purpose
> CouchDB was initially designed for.
> I came up with the following solution: grouping by value (URI) and time
> using group_level=1 and the start and end keys as follows:
> /_temp_view?group=true&group_level=1&startkey=[1270826004.0]&endkey=[{},1270826011.0]
> and simply counting:
> function(doc) { emit([doc.URI, doc.Time], 1); }
> Now experienced CouchDB users might already see that this results in
> all documents being grouped, regardless of the time set in the start
> and end keys. I needed some time to figure out why, but finally realized
> the problem, even though I cannot explain it properly and maybe I am totally
> wrong after all.
> So I thought it might help to first map the documents by the time value
> and in a next step map and reduce them by the URI value. A different
> approach I came up with could be designing a third value for each document
> consisting of a combination of time and URI and working with that as the
> key!?
> Maybe, and hopefully, there is even a third approach I am not thinking of.
> I really appreciate the help.
> Thanks
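
A rough sketch of why the quoted query matches every row, assuming CouchDB's documented view collation order (null, false, true, numbers, strings, arrays, objects). The collate and inRange helpers and the sample keys are made up for illustration, and the string comparison is a simplification of what CouchDB actually does:

```javascript
// Simplified stand-in for CouchDB view collation (illustrative only).
// Type order: null < false < true < numbers < strings < arrays < objects.
function typeRank(v) {
  if (v === null) return 0;
  if (v === false) return 1;
  if (v === true) return 2;
  if (typeof v === "number") return 3;
  if (typeof v === "string") return 4;
  if (Array.isArray(v)) return 5;
  return 6; // plain objects such as {}
}

function collate(a, b) {
  var ra = typeRank(a), rb = typeRank(b);
  if (ra !== rb) return ra - rb;          // different types: type order decides
  if (ra === 3) return a - b;             // numbers
  if (ra === 4) return a < b ? -1 : (a > b ? 1 : 0); // strings (simplified)
  if (ra === 5) {                         // arrays: element by element
    for (var i = 0; i < Math.min(a.length, b.length); i++) {
      var c = collate(a[i], b[i]);
      if (c !== 0) return c;
    }
    return a.length - b.length;           // shorter array sorts first
  }
  return 0; // equal scalars; objects are not compared further in this sketch
}

function inRange(key, startkey, endkey) {
  return collate(startkey, key) <= 0 && collate(key, endkey) <= 0;
}

// Hypothetical keys as produced by emit([doc.URI, doc.Time], 1)
var keys = [
  ["/a", 1270000000.0], // long before the interval
  ["/a", 1270826005.0], // inside the interval
  ["/b", 1279999999.0]  // long after the interval
];
var startkey = [1270826004.0];
var endkey = [{}, 1270826011.0];

var matched = keys.filter(function (k) { return inRange(k, startkey, endkey); });
console.log(matched.length + " of " + keys.length + " keys fall in the range");
```

The first key element decides the comparison: the number in startkey sorts before any URI string, and the {} in endkey sorts after any URI string, so the Time in the second position is never consulted.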

I forgot about the possibility of using an if statement in the map phase:

if (start < doc.Time && doc.Time < end) { emit ... }
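
A minimal sketch of that filter, assuming the bounds are simply hard-coded (START and END are hypothetical constants, and emit is stubbed so the snippet runs outside CouchDB). The test is written as two comparisons joined with &&, because JavaScript evaluates a < b < c as (a < b) < c, which compares a boolean against c:

```javascript
// Hypothetical interval bounds, hard-coded for illustration. Inside CouchDB
// they would have to live in the map function itself, which receives only
// the document as its argument.
var START = 1270826004.0;
var END = 1270826011.0;

// Stand-in for CouchDB's emit() so the sketch can run on its own.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function: emit a row only for documents strictly inside the interval.
function map(doc) {
  if (START < doc.Time && doc.Time < END) {
    emit(doc.URI, 1);
  }
}

// Made-up log documents
var docs = [
  { URI: "/a", Time: 1270826003.0 }, // before the interval
  { URI: "/a", Time: 1270826005.0 }, // inside
  { URI: "/b", Time: 1270826010.5 }, // inside
  { URI: "/b", Time: 1270826011.0 }  // on the upper bound, excluded
];
docs.forEach(map);

console.log(rows.length + " rows emitted");
```

With the URI alone as the key, a counting reduce queried with group=true would then give one count per URI for the fixed interval.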

But I simply don't know, and asked earlier on this list, how to pass such
parameters (start & end) to a permanent view / design document using

