couchdb-user mailing list archives

From "Kropp, Henning" <hkr...@microlution.de>
Subject Re: Multiple map reduce stages
Date Thu, 20 May 2010 09:01:58 GMT
On 20.05.2010 10:48, Kropp, Henning wrote:
> On 18.05.2010 20:16, J Chris Anderson wrote:
>   
>> On May 18, 2010, at 2:52 AM, Kropp, Henning wrote:
>>
>>   
>>     
>>> Hi,
>>>
>>> as far as I know, working with map reduce commonly involves multiple map
>>> and reduce stages. A view in couchdb consists solely of one map and, if
>>> necessary, one reduce stage!? To have multiple map and reduce stages one
>>> would have to chain views in couchdb!? How can I do that? Is it
>>> possible to give the function(doc){..} another parameter? The show
>>> functions, for example, have the extra parameter req for the http request.
>>> Unfortunately my javascript knowledge of the underlying Prototype
>>> concept is not very solid, which could be helpful here?
>>>
>>> Kind regards and many thanks in advance
>>>     
>>>       
>> CouchDB Map Reduce is a realtime incremental model, so it is quite different from
>> the Hadoop-style batch model. Of course you can still chain map reduce by copying the rows
>> from a view query to a new db, and writing another view on the new db.
>>
>> Chris
>>     
> That is interesting to know. Hive adopts the batch model but obviously
> serves a different purpose.
>
> I was asking because of an actual problem I am having; maybe someone can
> help. I would like to group documents by a value, but only those
> documents within a certain time interval. In this scenario couchdb is
> used for logging, which might not be a purpose couchdb was initially
> designed for.
>
> I came up with the following solution: grouping by value (uri) and time
> using group_level=1 and the start and end key as follows:
>
> /_temp_view?group=true&group_level=1&startkey=[1270826004.0]&endkey=[{},1270826011.0]
>
> and simply counting
>
> function(doc) { emit([doc.URI, doc.Time], 1); }
>
> Now experienced couchdb users might already see that this results in
> all documents being grouped, regardless of the time set in the start
> and end key. I needed some time to figure out why, but finally realized
> the problem, even though I cannot explain it properly and maybe I am
> totally wrong after all.
>
> So I thought it might help to first map the documents by the time value
> and, in a next step, map and reduce them by the uri value. A different
> approach I came up with could be to add a third value to each document,
> consisting of a combination of time and uri, and to work with that as
> the key!?
>
> Maybe and hopefully there is even a third approach I am not thinking of.
> I really appreciate the help.
>
> Thanks
>
>   
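
One way to see why the time bounds in the query above have no effect is to
compare the keys the way the view index orders them. The sketch below is a
simplified stand-in for CouchDB's view collation, not its actual
implementation: it assumes the documented type order (null, booleans,
numbers, strings, arrays, objects), compares arrays element by element, and
uses plain JavaScript string comparison where CouchDB uses ICU. The helper
names (rank, collate, inRange) and the sample keys are made up for
illustration.

```javascript
// Simplified model of view key ordering (an assumption, not CouchDB's code).
function rank(v) {
  if (v === null) return 0;
  if (typeof v === "boolean") return 1;
  if (typeof v === "number") return 2;
  if (typeof v === "string") return 3;
  if (Array.isArray(v)) return 4;
  return 5; // plain objects sort last
}

function collate(a, b) {
  const ra = rank(a), rb = rank(b);
  if (ra !== rb) return ra - rb;            // different types: type order decides
  if (ra === 1) return Number(a) - Number(b); // false before true
  if (ra === 2) return a - b;
  if (ra === 3) return a < b ? -1 : a > b ? 1 : 0; // real CouchDB uses ICU here
  if (ra === 4) {
    // Arrays: compare element by element, shorter prefix sorts first.
    for (let i = 0; i < Math.min(a.length, b.length); i++) {
      const c = collate(a[i], b[i]);
      if (c !== 0) return c;
    }
    return a.length - b.length;
  }
  return 0; // objects treated as equal in this sketch
}

function inRange(key, startkey, endkey) {
  return collate(startkey, key) <= 0 && collate(key, endkey) <= 0;
}

// Bounds copied from the query in the thread.
const startkey = [1270826004.0];
const endkey = [{}, 1270826011.0];

// Hypothetical emitted keys: one before and one long after the interval.
const early = ["/index.html", 1270826000.0];
const late = ["/index.html", 1270899999.0];

console.log(inRange(early, startkey, endkey)); // true
console.log(inRange(late, startkey, endkey));  // true
```

Because the first element of every emitted key is a string, it sorts after
the number in the startkey and before the {} in the endkey, so the time in
the second position is never compared.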

I forgot about the possibility of using an if statement in the map phase,
like

if (start < doc.Time && doc.Time < end) { emit ... }

But I simply don't know, and asked earlier on this list, how to pass such
parameters (start & end) to a permanent view / design document using
couchdb.js.
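
A possible alternative that avoids passing start and end into the map
function, sketched here under assumptions rather than as a tested
solution: emit the time as the key and the URI as the value, select the
interval with startkey and endkey at query time, and count per URI on the
client. The code below only simulates this in memory; the sample documents
and the helper names (mapByTime, buildView, queryRange, countByUri) are
hypothetical, and in a real design document emit is a global rather than an
argument.

```javascript
// Hypothetical log documents; the field names URI and Time follow the thread.
const docs = [
  { URI: "/a", Time: 1270826003.0 },
  { URI: "/a", Time: 1270826005.0 },
  { URI: "/b", Time: 1270826007.0 },
  { URI: "/a", Time: 1270826010.0 },
  { URI: "/b", Time: 1270826012.0 },
];

// Map step: time is the key, so a key range is a time range.
function mapByTime(doc, emit) {
  emit(doc.Time, doc.URI);
}

// In-memory stand-in for the view index: run the map and sort rows by key.
function buildView(allDocs) {
  const rows = [];
  const emit = (key, value) => rows.push({ key, value });
  for (const doc of allDocs) mapByTime(doc, emit);
  rows.sort((a, b) => a.key - b.key);
  return rows;
}

// Stand-in for querying the view with ?startkey=...&endkey=... (inclusive).
function queryRange(rows, startkey, endkey) {
  return rows.filter((r) => r.key >= startkey && r.key <= endkey);
}

// Client-side grouping: count the returned rows per URI.
function countByUri(rows) {
  const counts = {};
  for (const r of rows) counts[r.value] = (counts[r.value] || 0) + 1;
  return counts;
}

const counts = countByUri(
  queryRange(buildView(docs), 1270826004.0, 1270826011.0)
);
console.log(counts); // { '/a': 2, '/b': 1 }
```

The trade-off is that the grouping happens outside the view, so every row
in the interval is transferred to the client before it is counted.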

regards 

