hadoop-common-user mailing list archives

From "Runping Qi" <runp...@yahoo-inc.com>
Subject RE: performance of multiple map-reduce operations
Date Wed, 07 Nov 2007 04:59:38 GMT

The o.a.h.m.jobcontrol.JobControl  class allows you to build a
dependency graph of jobs and submit them.
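As a rough sketch against the old org.apache.hadoop.mapred.jobcontrol API (the two configure* helpers below are hypothetical placeholders for your own per-stage JobConf setup), chaining two jobs so that the second is only submitted once the first succeeds looks roughly like this:

    import java.io.IOException;

    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.jobcontrol.Job;
    import org.apache.hadoop.mapred.jobcontrol.JobControl;

    public class TwoStagePipeline {

      // Hypothetical helpers: assume each returns a fully configured JobConf
      // (mapper/reducer classes, input and output paths) for its stage.
      static JobConf configureFirstStage()  { return new JobConf(); }
      static JobConf configureSecondStage() { return new JobConf(); }

      public static void main(String[] args)
          throws IOException, InterruptedException {
        Job first  = new Job(configureFirstStage());
        Job second = new Job(configureSecondStage());
        second.addDependingJob(first);   // second is held back until first succeeds

        JobControl control = new JobControl("two-stage-pipeline");
        control.addJob(first);
        control.addJob(second);

        // JobControl implements Runnable; run it in its own thread and poll.
        Thread runner = new Thread(control);
        runner.start();
        while (!control.allFinished()) {
          Thread.sleep(1000);
        }
        control.stop();
      }
    }

The dependency graph is built client-side; JobControl simply submits each job once everything it depends on has completed successfully.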


> -----Original Message-----
> From: Joydeep Sen Sarma [mailto:jssarma@facebook.com]
> Sent: Tuesday, November 06, 2007 8:20 PM
> To: hadoop-user@lucene.apache.org
> Subject: RE: performance of multiple map-reduce operations
> Stating the obvious - the issue is starting the next mapper using
> outputs from a reduce that is not yet complete. Currently the reduce
> succeeds all or nothing and partial output is not available. If this
> behavior is changed, one has to make sure:
> - partial reduce outputs are cleaned up in case of eventual failure of
> the reduce job (and I guess kill the next map job as well)
> - the next map job runs in a mode where its input files arrive over
> time (and are not statically fixed at launch time)
> neither of which seems like a small change.
> Even without such optimizations - I would love it if one could submit a
> dependency graph of jobs to the job tracker (instead of arranging it
> from outside). (But I digress.)
> -----Original Message-----
> From: redpony@gmail.com [mailto:redpony@gmail.com] On Behalf Of Chris
> Dyer
> Sent: Tuesday, November 06, 2007 5:05 PM
> To: hadoop-user@lucene.apache.org
> Subject: Re: performance of multiple map-reduce operations
> On 11/6/07, Doug Cutting <cutting@apache.org> wrote:
> >
> > Joydeep Sen Sarma wrote:
> > > One of the controversies is whether in the presence of failures,
> > > this makes performance worse rather than better (kind of like udp
> > > vs. tcp - what's better depends on error rate). The probability of
> > > a failure per job will increase non-linearly as the number of nodes
> > > involved per job increases. So what might make sense for small
> > > clusters may not make sense for bigger ones. But it sure would be
> > > nice to have this option.
> >
> > Hmm.  Personally I wouldn't put a very high priority on complicated
> > features that don't scale well.
> I don't think that what we're asking for is necessarily that
> complicated, and I think it can be made to scale quite well.  The
> critique is appropriate for the current "polyreduce" prototype, but
> that is a particular solution to a general problem.  And *scaling* is
> exactly the issue that we are trying to address here (since the
> current model is quite wasteful of resources).
> As for complexity, I do agree that the current prototype that Joydeep
> linked to is needlessly cumbersome, but, to solve this problem
> theoretically, the only additional information that would need to be
> represented to a scheduler to plan a composite job is just a
> dependency graph between the mappers and reducers.  Dependency graphs
> are about as clear as it gets in this business, and they certainly
> aren't going to have any scaling or conceptual problems.
> I don't really see how failures would be more catastrophic if this is
> implemented purely in terms of smarter scheduling (which is adamantly
> not what the current polyreduce prototype does).  You're running the
> exact same jobs you would be running otherwise, just sooner. :)
> Chris
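
For what it's worth, the dependency graph Chris describes needn't be anything exotic. A toy, non-Hadoop sketch (all stage names here are made up for illustration) of a stage graph with a topological launch order:

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    // Toy sketch only (not part of Hadoop): a composite job as a dependency
    // graph of stages, with a topological order giving a legal launch order.
    public class StageGraph {

      // stage name -> names of the stages it depends on
      private final Map<String, Set<String>> deps =
          new HashMap<String, Set<String>>();

      public void addStage(String name, String... dependsOn) {
        if (!deps.containsKey(name)) {
          deps.put(name, new HashSet<String>());
        }
        deps.get(name).addAll(Arrays.asList(dependsOn));
        for (String d : dependsOn) {
          if (!deps.containsKey(d)) {
            deps.put(d, new HashSet<String>());
          }
        }
      }

      // Kahn-style topological sort: repeatedly emit any stage whose
      // dependencies have all been emitted; no progress means a cycle.
      public List<String> launchOrder() {
        List<String> order = new ArrayList<String>();
        Set<String> done = new HashSet<String>();
        while (done.size() < deps.size()) {
          boolean progressed = false;
          for (Map.Entry<String, Set<String>> e : deps.entrySet()) {
            if (!done.contains(e.getKey()) && done.containsAll(e.getValue())) {
              order.add(e.getKey());
              done.add(e.getKey());
              progressed = true;
            }
          }
          if (!progressed) {
            throw new IllegalStateException("cycle in stage dependencies");
          }
        }
        return order;
      }

      public static void main(String[] args) {
        StageGraph g = new StageGraph();
        g.addStage("map1");
        g.addStage("reduce1", "map1");
        g.addStage("map2", "reduce1");   // map2 consumes reduce1's output
        g.addStage("reduce2", "map2");
        System.out.println(g.launchOrder());  // e.g. [map1, reduce1, map2, reduce2]
      }
    }

A scheduler holding such a graph could launch each stage as soon as its predecessors finish, which is exactly the "run the same jobs, just sooner" point above.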
