incubator-couchdb-user mailing list archives

From James Marca <>
Subject Re: Chaining of views/MapReduce
Date Mon, 22 Feb 2010 18:16:23 GMT
On Fri, Feb 19, 2010 at 10:10:23AM -0500, J Chris Anderson wrote:
> On Feb 17, 2010, at 5:29 PM, Norman Rosner wrote:
> > 
> > On 17.02.2010, at 23:15, Mario Scheliga wrote:
> > 
> >> Hi Norman,
> >> 
> >> updating a document from a map function is not possible and seems to be the wrong
> >> approach. Think of the map function as processing docs separately (sandbox), so you
> >> are able to spread the execution over thousands of servers ;-)
> > 
> > True that! But: suppose I'm just creating/updating one document per couchdb-instance,
> > that should be ok, right? Because after that, I can easily get all the result documents and
> > merge them together. I would do it in a similar way in Hadoop. And as far as I read in the
> > loooong archives of this list, I'm not the only one who wants to do such things.
>
> The "proper" way to do this is to have a simple CouchDB map reduce view that is the 1st
> phase of your chain.
> Then query the view with group=true and store the output into an empty db (one document
> per row).
> Now you can write another view on top of the derived db to do the second phase (sort
> by value, etc).

Forgive me in advance, I have no Erlang skills and no time or ability
to submit a patch, but I have to ask.  Are there any plans in the
development roadmap to make this less of a kludge and more of a core
feature?

I see two problems with the current proper way.  First, it seems
wasteful of disk space to have a view generated and then store
essentially the same thing as a separate db.  Second and more
importantly, as a developer you have to write long-lasting code that
pays attention to the source database to update the chain of
view->db->view->db...->view when the source db data changes.  It would
be nicer if CouchDB could manage all that internally.  Perhaps the map
code could explicitly dump to a db, maybe something like emit_chained
with a required target db as a third argument, so that changes to the
source database can get cascaded automatically.
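
To make the manual step concrete, here is a rough sketch in Python of
one link in the view->db chain: query a view with group=true and store
one document per row in a derived db.  The function names, the
base_url/db/design-doc arguments, and the choice of a JSON-encoded key
as the document _id are all my own assumptions for illustration, not
anything CouchDB prescribes.  It uses only the view query URL and the
_bulk_docs endpoint.

```python
import json
import urllib.request


def rows_to_docs(view_result):
    """Turn each reduced view row into one document for the derived db."""
    docs = []
    for row in view_result["rows"]:
        docs.append({
            # Assumption: derive _id from the key so a rerun targets the same doc.
            "_id": json.dumps(row["key"], sort_keys=True),
            "key": row["key"],
            "value": row["value"],
        })
    return docs


def copy_view_to_db(base_url, source_db, ddoc, view, target_db):
    """Query the phase-1 view with group=true and bulk-store the rows.

    All arguments are hypothetical placeholders, e.g.
    base_url="http://localhost:5984".
    """
    view_url = f"{base_url}/{source_db}/_design/{ddoc}/_view/{view}?group=true"
    with urllib.request.urlopen(view_url) as resp:
        result = json.load(resp)

    body = json.dumps({"docs": rows_to_docs(result)}).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/{target_db}/_bulk_docs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Note that this only works cleanly against an empty target db.  On a
rerun, every document that already exists needs its current _rev or
the write is rejected as a conflict, and rows that have disappeared
from the view have to be deleted by hand.  That bookkeeping is exactly
the long-lasting code I would rather not have to write.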

