hadoop-common-user mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: Task type priorities during scheduling ?
Date Mon, 24 Jul 2006 08:09:03 GMT
Eric Baldeschwieler wrote:
> Of course interleaving the sort with the copy phase would also be 
> desirable...
> But I'm all for clearly IDing reduces vs shuffle.

I think this is mostly a terminology problem.

There is a 1:1 correspondence between shuffle tasks and reduce tasks, 
and a strictly ordered dependency between them.  There's no advantage 
in trying to separate their implementations: we need to start a thread 
to manage first a shuffle and then, immediately after, if the shuffle 
succeeds, a reduce.  So this may as well be the same thread.
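
A minimal sketch of that ordering, assuming nothing about the real 
TaskTracker code: the class and method names below are hypothetical, 
and the shuffle and reduce bodies are stand-ins passed in by the caller. 
It only illustrates one thread running the shuffle and then, if that 
succeeded, the reduce.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical sketch; none of these names are Hadoop classes.
final class ShuffleThenReduce {

    /**
     * Runs the shuffle and, only if it reports success, the reduce,
     * both on the same worker thread. Returns the steps taken, in order.
     */
    static List<String> runOnOneThread(BooleanSupplier shuffle, Runnable reduce) {
        List<String> steps = new ArrayList<>();
        Thread worker = new Thread(() -> {
            steps.add("shuffle");
            if (shuffle.getAsBoolean()) {
                // Strict ordering: reduce starts only after a successful shuffle.
                steps.add("reduce");
                reduce.run();
            } else {
                steps.add("failed");
            }
        });
        worker.start();
        try {
            // join() makes the worker's writes to 'steps' visible to the caller.
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for task", e);
        }
        return steps;
    }
}
```

Because the second step can never start before the first finishes, a 
separately scheduled shuffle thread would add coordination without 
adding any parallelism.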

So I don't think we need a ShuffleTask class, separately scheduled by 
the TaskTracker, but, rather, we just need to start calling the first 
part of the reduce task progress "shuffle".  Thus the fix is only to 
the progress-reporting code.
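
As a rough illustration of a reporting-only change, here is a sketch 
that labels the first part of a reduce task's progress "shuffle". The 
class name, the method names, and the one-third split are all 
assumptions made for the example, not the actual reporting code.

```java
// Hypothetical sketch; not the actual Hadoop progress-reporting code.
final class ReduceProgress {

    // Assumed split: the first third of a reduce task's progress is the
    // copy of map outputs, reported under the name "shuffle".
    static final double SHUFFLE_SHARE = 1.0 / 3.0;

    /** Maps overall task progress in [0, 1] to the phase name to report. */
    static String phaseName(double progress) {
        if (progress < 0.0 || progress > 1.0) {
            throw new IllegalArgumentException("progress out of range: " + progress);
        }
        return progress < SHUFFLE_SHARE ? "shuffle" : "reduce";
    }

    /** Builds a one-line status string such as "shuffle 25%". */
    static String status(double progress) {
        return phaseName(progress) + " " + Math.round(progress * 100) + "%";
    }
}
```

The task itself is still a single reduce task; only the label attached 
to its early progress changes.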

