couchdb-user mailing list archives

From James Marca <jma...@translab.its.uci.edu>
Subject Re: Chaining of views/MapReduce
Date Mon, 22 Feb 2010 18:20:39 GMT
On Mon, Feb 22, 2010 at 10:16:23AM -0800, James Marca wrote:
> On Fri, Feb 19, 2010 at 10:10:23AM -0500, J Chris Anderson wrote:
> > 
> > On Feb 17, 2010, at 5:29 PM, Norman Rosner wrote:
> > 
> > > 
> > > On 17.02.2010, at 23:15, Mario Scheliga wrote:
> > > 
> > >> Hi Norman,
> > >> 
> > >> Updating a document from a map function is not possible, and it seems to be
> > >> the wrong way.
> > >> Think of the map function as processing each doc separately (sandboxed), so you are
> > >> able to spread the execution over thousands of servers ;-)
> > > 
> > > True that! But suppose I'm just creating/updating one document per CouchDB instance;
> > > that should be OK, right? Because after that, I can easily get all the result documents and
> > > merge them together. I would do it in a similar way in Hadoop. And as far as I have read in the
> > > loooong archives of this list, I'm not the only one who wants to do such things.
> > 
> > 
> > The "proper" way to do this is to have a simple CouchDB map reduce view that is
> > the 1st phase of your chain.
> > 
> > Then query the view with group=true and store the output into an empty db (one document
> > per row).
> > 
> > Now you can write another view on top of the derived db to do the second phase (sort
> > by value, etc).
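
The "one document per row" step described above can be sketched in Python. This is only an illustration, not code from the thread: the function names and the sample rows are hypothetical, and deriving each `_id` from the JSON-encoded row key is an assumption made so that re-running the copy targets the same documents. The input has the shape CouchDB returns for a reduce view queried with group=true, and the output is a body that could be POSTed to the derived database's `_bulk_docs` endpoint.

```python
import json


def rows_to_docs(view_result):
    """Turn a grouped view result into one document per row."""
    docs = []
    for row in view_result["rows"]:
        docs.append({
            # Assumption: a deterministic _id built from the row key, so
            # repeating the copy addresses the same derived document.
            "_id": json.dumps(row["key"], separators=(",", ":"), sort_keys=True),
            "key": row["key"],
            "value": row["value"],
        })
    return docs


def bulk_docs_payload(view_result):
    """Wrap the derived documents in the body shape _bulk_docs expects."""
    return {"docs": rows_to_docs(view_result)}


# Hypothetical result of GET /source/_design/<ddoc>/_view/<view>?group=true
sample = {"rows": [{"key": "apple", "value": 3},
                   {"key": ["2010", 2], "value": 7}]}
payload = bulk_docs_payload(sample)
```

A second view defined in the derived database can then emit `doc.value` as its key to get the sort-by-value phase.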
> 
> Forgive me in advance, I have no erlang skills and no time or ability
> to submit a patch, but I have to ask.  Are there any plans in the
> development roadmap to make this less a kludge and more a core
> feature?  

Okay, double apologies, I just saw the longer thread hashing out this
topic.

Back to lurking silently.
James

> 
> I see two problems with the current proper way.  First, it seems
> wasteful of disk space to have a view generated and then store
> essentially the same thing as a separate db.  Second and more
> importantly, as a developer you have to write long-lasting code that
> pays attention to the source database to update the chain of
> view->db->view->db...->view when the source db data changes.  It would
> be nicer if CouchDB could manage all that internally.  Perhaps the map
> code could explicitly dump to a db, maybe something like emit_chained
> with a required target db as a third argument, so that changes to the
> source database can get cascaded automatically.
> 
> Regards,
> James
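
The bookkeeping James describes (keeping a derived db in step with its source view) might look roughly like the sketch below. It is a minimal illustration under stated assumptions, not an existing CouchDB feature: `plan_sync` is a hypothetical helper, the derived documents are assumed to carry a `value` field, and the caller is assumed to have already fetched the current derived documents (with their `_rev`) and the fresh view rows. It only computes the list of creates, updates, and deletions that would go into a `_bulk_docs` request.

```python
def plan_sync(existing, fresh):
    """Compute the writes needed to make the derived db match fresh rows.

    existing: dict mapping _id to the current derived doc (including _rev).
    fresh:    list of docs built from the latest view output (no _rev).
    """
    changes = []
    fresh_ids = set()
    for doc in fresh:
        fresh_ids.add(doc["_id"])
        old = existing.get(doc["_id"])
        if old is None:
            # Row is new in the view: create the document.
            changes.append(dict(doc))
        elif old.get("value") != doc["value"]:
            # Row changed: update, carrying over the current revision.
            changes.append({**doc, "_rev": old["_rev"]})
    for doc_id, old in existing.items():
        if doc_id not in fresh_ids:
            # Row vanished from the view: mark the derived doc as deleted.
            changes.append({"_id": doc_id, "_rev": old["_rev"], "_deleted": True})
    return changes


# Hypothetical data for illustration only.
existing = {
    "a": {"_id": "a", "_rev": "1-x", "value": 1},
    "b": {"_id": "b", "_rev": "1-y", "value": 2},
    "c": {"_id": "c", "_rev": "1-z", "value": 3},
}
fresh = [{"_id": "a", "value": 1},
         {"_id": "b", "value": 5},
         {"_id": "d", "value": 9}]
changes = plan_sync(existing, fresh)
```

Each further stage in a view->db->view chain would need the same step repeated, which is the maintenance burden the message points out.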
> 
> -- 
> This message has been scanned for viruses and
> dangerous content by MailScanner, and is
> believed to be clean.