incubator-couchdb-user mailing list archives

From Markus Jelsma <mar...@buyways.nl>
Subject Re: M/R/M, again
Date Fri, 12 Feb 2010 12:52:25 GMT
Paul,

>Markus,
>
>Sorry it's taken me so long to sit down and tap out a reply.
>So, as to M/R/M, it turns out to be quite a bit harder to keep the
>same semantics of incremental view updates as well as the same reduce
>semantics when moving to a 'pure' implementation inside the view
>engine. Specifically, subsequent maps would need to be updated at the
>same time as the first map or we would need to add an update sequence
>like the main database has. Neither of these is a very good solution
>IMO. Also, the reduce semantics make it hard to hook subsequent M/R
>steps up to a view because of how reduces are implemented. The fix
>would require making reductions be persisted to a b-tree and then we'd
>need to pre-declare group_levels somehow. Quite a bit of work. Also,
>because we aren't a 'Google M/R' implementation that guarantees 1
>unique key after each M/R stage the merge step becomes less trivial
>than the original M/R/M paper.

Sounds like a lot of tough work for a feature that has some decent, although 
less elegant, workarounds. I can still choose between executing two separate 
requests and merging all the data into one single document.
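
A minimal sketch of the first workaround (two separate requests, merged on the 
client), in Python. It assumes both view responses have already been fetched 
and decoded from JSON. The function name merge_view_results and the sample data 
are hypothetical; only the {"rows": [{"key": ..., "value": ...}]} response shape 
is CouchDB's own.

```python
import json


def merge_view_results(first, second):
    """Merge the rows of two decoded CouchDB view responses by key.

    Returns a dict mapping the JSON-encoded key to
    {"first": [values...], "second": [values...]}.
    """
    merged = {}
    for label, result in (("first", first), ("second", second)):
        for row in result["rows"]:
            # View keys may be arrays or objects, so encode them
            # to obtain a hashable dictionary key.
            key = json.dumps(row["key"], sort_keys=True)
            entry = merged.setdefault(key, {"first": [], "second": []})
            entry[label].append(row["value"])
    return merged


# Hypothetical sample data standing in for the two HTTP responses.
orders = {"rows": [{"key": "cust-1", "value": 30},
                   {"key": "cust-2", "value": 12}]}
names = {"rows": [{"key": "cust-1", "value": "Alice"}]}

merged = merge_view_results(orders, names)
```

Values are collected in lists because a view can emit the same key more than 
once, so the merge cannot assume one row per key.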

>These hurdles aren't insurmountable, but the longer I looked at the
>issues the more I thought that I would probably just end up writing a
>new indexer that has a slightly different M/R model to allow for such
>things. And then promptly never got around to it.
>However I have been trying to figure out how to create a CouchDB
>version of Riak's Jaywalker feature. It could do similar things to
>what you're wanting, but there are a couple problems that would put
>the hurt on cluster setups with the initial method I have in mind. And
>it's a fairly decent sized addition so unless the implementation
>suddenly crystallizes into a simple solution I don't think it'll be in
>0.11 and hence 1.0.

It would obviously result in many difficulties in a cluster setup. I am using 
a sharded cluster on several machines, and it would be quite a task to find out 
on which shard and which node a document that needs to be merged resides.

And ... Riak? Jaywalker? Can you explain? What is it, how is it related, and 
so on? :)


>
>HTH,
>Paul Davis

Markus Jelsma - Technisch Architect - Buyways BV
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350

