hadoop-common-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: resource based scheduling - fairscheduler
Date Wed, 18 Apr 2012 06:01:17 GMT
Yes, I've implemented simple-enough LoadManagers for some experiments.
It's not very hard to implement one for simple needs.

The LoadManager in 0.20/1.0 is more of a "decider". You only need to
implement the two methods below, each returning true or false:

  public abstract boolean canAssignMap(TaskTrackerStatus tracker,
      int totalRunnableMaps, int totalMapSlots, int alreadyAssigned);

  public abstract boolean canAssignReduce(TaskTrackerStatus tracker,
      int totalRunnableReduces, int totalReduceSlots, int alreadyAssigned);

The TTS object gives you access to the TT's resources (CPU, Memory,
etc. - There's much you can get out of it, just read its API/source)
which can be used to influence your decision.
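
As a rough illustration of such a decider, here is a minimal sketch. It
does not compile against Hadoop: TrackerSnapshot is a hypothetical
stand-in for the few values you would read from the TTS, and its field
names, the MemoryAwareDecider class and the bytesPerTask budget are all
assumptions made for this example. It also assumes the free-memory
figure was sampled before any tasks were handed out in the current
heartbeat.

```java
// Hypothetical stand-in for values read from a TaskTrackerStatus.
// Field names are illustrative only, not Hadoop's API.
final class TrackerSnapshot {
    final int runningMaps;
    final int runningReduces;
    final int maxMapSlots;
    final int maxReduceSlots;
    final long freeMemoryBytes;

    TrackerSnapshot(int runningMaps, int runningReduces,
                    int maxMapSlots, int maxReduceSlots,
                    long freeMemoryBytes) {
        this.runningMaps = runningMaps;
        this.runningReduces = runningReduces;
        this.maxMapSlots = maxMapSlots;
        this.maxReduceSlots = maxReduceSlots;
        this.freeMemoryBytes = freeMemoryBytes;
    }
}

// Hypothetical decider: accept a task only if the tracker has a free
// slot and enough free memory for one more task.
final class MemoryAwareDecider {
    private final long bytesPerTask; // assumed per-task memory budget

    MemoryAwareDecider(long bytesPerTask) {
        this.bytesPerTask = bytesPerTask;
    }

    boolean canAssignMap(TrackerSnapshot t, int totalRunnableMaps,
                         int totalMapSlots, int alreadyAssigned) {
        return t.runningMaps + alreadyAssigned < t.maxMapSlots
                && hasMemoryFor(t, alreadyAssigned);
    }

    boolean canAssignReduce(TrackerSnapshot t, int totalRunnableReduces,
                            int totalReduceSlots, int alreadyAssigned) {
        return t.runningReduces + alreadyAssigned < t.maxReduceSlots
                && hasMemoryFor(t, alreadyAssigned);
    }

    // True if free memory covers the tasks already handed out in this
    // heartbeat plus the one being considered.
    private boolean hasMemoryFor(TrackerSnapshot t, int alreadyAssigned) {
        return t.freeMemoryBytes >= (long) (alreadyAssigned + 1) * bytesPerTask;
    }
}
```

The cluster-wide totals are ignored here on purpose; a real
implementation would extend LoadManager and pull the equivalent numbers
from the TTS.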

I don't think this would let you perform reservations like the
CapacityScheduler (CS) does, though. It's more of a "Can/should this TT
accept more tasks at the moment?" deciding class. See also the
CapBasedLoadManager class for a cap-based implementation that helps the
FairScheduler (FS) spread tasks across the cluster rather than
completely fill up one TT after another.
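
To make the cap idea concrete, here is a small sketch of one way to
compute a per-tracker cap: scale the tracker's slot count by the
cluster-wide ratio of runnable tasks to slots. This is an illustration
of the general approach, not a copy of CapBasedLoadManager; the
CapSketch class and its method names are hypothetical, so check the
real source for its exact formula and configuration options.

```java
// Hypothetical illustration of a load-proportional task cap.
final class CapSketch {

    // Cap for one tracker: its slot count scaled by cluster load
    // (runnable tasks / total slots, at most 1.0), rounded up.
    static int cap(int totalRunnable, int localMaxSlots, int totalSlots) {
        if (totalSlots <= 0) {
            return 0; // no slots in the cluster, nothing can run
        }
        double load = Math.min(1.0, (double) totalRunnable / totalSlots);
        return (int) Math.ceil(localMaxSlots * load);
    }

    // Accept another task only while the tracker is below its cap.
    static boolean canAssign(int runningOnTracker, int alreadyAssigned,
                             int totalRunnable, int localMaxSlots,
                             int totalSlots) {
        return runningOnTracker + alreadyAssigned
                < cap(totalRunnable, localMaxSlots, totalSlots);
    }
}
```

With 10 runnable tasks on a 40-slot cluster, a 4-slot tracker gets a
cap of 1, so the tasks spread over roughly ten trackers instead of
filling three of them.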

On Wed, Apr 18, 2012 at 10:02 AM, Corbin Hoenes <corbin@tynt.com> wrote:
> mapred.fairscheduler.loadmanager - An extension point that lets you
> specify a class that determines how many maps and reduces can run on a
> given TaskTracker. This class should implement the LoadManager
> interface. By default the task caps in the Hadoop config file are
> used, but this option could be used to make the load based on
> available memory and CPU utilization for example.
> Has someone implemented a loadmanager that does something similar to
> capacity scheduler's resource base scheduling?  Even experimentally?

Harsh J
