hadoop-common-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael Bieniosek <mich...@powerset.com>
Subject Re: Choosing a task for the TT to run
Date Fri, 23 May 2008 20:38:45 GMT
I've wondered about this too -- if there aren't enough tasks for all the
tasktrackers, sometimes it might be better to leave some tasktrackers
completely idle.  These idle tasktrackers might then be reclaimed by some
higher-level resource scheduler.  There might also be some power consumption
implications for idle machines.

At least, it would be nice to have this behavior configurable.

-Michael

On 5/23/08 3:43 AM, "Vivek Ratan" <vivekr@yahoo-inc.com> wrote:

> I'm curious about some code in JobTracker::getNewTaskForTaskTracker().
> This is the method that decides what task to give a free TT.
>
> For each call, the JT first figures out the 'remaining load' per TT for
> maps/reduces (which is the total number of map and reduce tasks that
> need to be run across all running jobs, divided by the num of TTs). It
> then figures out how many maximum map or reduce tasks should be run on
> the TT (which is the minimum of the TT's capacity and the 'remaining
> load') - call this the 'max load'. Finally, if a TT can run something
> (ie, if the # of maps/reduces it is running is less than the 'max
> load'), it looks to give it a map task or a reduce task.
>
> My questions:
>
> 1. I'm wondering why we need to do all this. It's probably to spread the
> tasks across slots. For eg., you may want to avoid the case where if
> there are 5 TTs, each with 10 free slots, and you have 30 tasks, you
> fill up all 10 slots each of the first three TTs, leaving the other two
> TTs under-utilized. Is this correct?
>
> 2. Suppose I have two TTs, each with 4 slots for Maps (it's the sme for
> reduces - doesn't really matter). Let's suppose each TT is already
> running two Maps each, and let's suppose there are 2 maps remaining to
> be run. Then it appears that those two remaining Maps cannot be run on
> either TT (for each TT, 'numMaps' = 2, and 'maxMapLoad' = 1, so the TT
> will not be given a task). The JT will have to wait till the 2 running
> Maps on each TT finsih, before it can allocate another Map, so the two
> unused slots in each TT go waste.
>
> 3. By balancing out the tasks, don't you run the risks of not fully
> utlizing faster nodes? I understand that the max load is calculated at
> each iteration, but the whole idea of balancing assumes some homegeneity
> among machines, which doesn't seem like a fair assumption.
>
> It just seems easier (both to code and understand) if you always give a
> TT what it asks for. If it has a free Map/Reduce slot, you give it a
> Map/Reduce task to run. if grids are typically always running at full
> capacity (which seems to be the case a lot of times), then greedily
> filling up a slot seems simple and correct - there are always more tasks
> to run than slots, and you'll always fill up slots and achieve good
> utilization. There doesn't seem to be any need for balancing.
>
> Or maybe I'm missing the real reason why we need to look at workload
> averages.


Mime
View raw message