hadoop-common-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Stu Hood" <stuh...@webmail.us>
Subject RE: Tech Talk: Dryad
Date Fri, 09 Nov 2007 18:21:38 GMT
Yea, near the end of his posting, Vuk mentioned that they had considered adding checkpointing
(by asynchronously storing the output of a R1|M2 to disk while it was being piped to R2),
but didn't get around to it.


-----Original Message-----
From: Joydeep Sen Sarma <jssarma@facebook.com>
Sent: Friday, November 9, 2007 1:12pm
To: hadoop-user@lucene.apache.org, stuhood@webmail.us
Subject: RE: Tech Talk: Dryad

I think we have to thing harder about how to address the problems with
managing errors and keeping track of too much state/rolling back etc.
This field is new to me - but I do remember from grad school that
checkpointing is a very relevant and researched topic in parallel
computing in general (which is really what the commit to hdfs between
one reduce and the next map does). 

(pretty vague - will try to do some reading when I find some time :-))

-----Original Message-----
From: Stu Hood [mailto:stuhood@webmail.us] 
Sent: Friday, November 09, 2007 10:03 AM
To: hadoop-user@lucene.apache.org
Subject: Re: Tech Talk: Dryad

I did read the conclusion of the previous thread, which was that nobody
"thought" that the performance gains would be worth the added
complexity. I simply think that if a patch is available, the developer
should be encouraged to submit it for review, since the topic has been
discussed so frequently.

I think our concept of "the" map/reduce primitive has been limited in
scope to the capabilities that Google described. There is no reason not
to explore potentially beneficial additions (even if Google didn't think
they were worthwhile).

Yes, Dryad is more confusing, because it is using a more flexible
primitive. I'm not suggesting that Hadoop should be rewritten to use a
DAG at its core, but we do already have the
o.a.h.m.jobcontrol.JobControl module, so _somebody_ must think the
concept is useful.

Re: Side note: As the presenter explained, he uses a small example first
to demonstrate the linear speedup. Next (~32 minute) he goes to an
example of sorting 10TB on 1800 machines in ~12 minutes...


-----Original Message-----
From: Owen O'Malley <oom@yahoo-inc.com>
Sent: Friday, November 9, 2007 12:32pm
To: hadoop-user@lucene.apache.org
Subject: Re: Tech Talk: Dryad

On Nov 9, 2007, at 8:49 AM, Stu Hood wrote:

> Currently there is no sanctioned method of 'piping' the reduce  
> output of one job directly into the map input of another (although  
> it has been discussed: see the thread I linked before: http:// 
> www.nabble.com/Poly-reduce--tf4313116.html ).

Did you read the conclusion of the previous thread? The performance  
gains in avoiding the second map input are trivial compared the gains  
in simplicity of having a single data path and re-execution story.  
During a reasonably large job, roughly 98% of your maps are reading  
data on the _same_ node. Once we put in rack locality, it will be  
even better.

I'd much much rather build the map/reduce primitive and support it  
very well than add the additional complexity of any sort of poly- 
reduce. I think it is very appropriate for systems like Pig to  
include that kind of optimization, but it should not be part of the  
base framework.

I watched the front of the Dryad talk and was struck by how complex  
it quickly became. It does give the application writer a lot of  
control, but to do the equivalent of a map/reduce sort with 100k maps  
and 4k reduces with automatic spill-over to disk during the shuffle  
seemed _really_ complicated.

On a side note, in the part of the talk that I watched, the scaling  
graph went from 2 to 9 nodes. Hadoop's scaling graphs go to 1000's of  
nodes. Did they ever suggest later in the talk that it scales up higher?

-- Owen

View raw message