hadoop-common-dev mailing list archives

From Zhenhua Guo <jen...@gmail.com>
Subject Re: mapper and reducer scheduling
Date Mon, 01 Nov 2010 02:49:46 GMT
Thanks!
One more question. Is the whole input file replicated to each node
where a mapper runs, or is only the portion processed by that mapper
transferred?

Gerald

On Fri, Oct 29, 2010 at 10:11 AM, Harsh J <qwertymaniac@gmail.com> wrote:
> Hello,
>
> On Fri, Oct 29, 2010 at 12:45 PM, Jeff Zhang <zjffdu@gmail.com> wrote:
>> Each TaskTracker tells the JobTracker how many free slots it has
>> through its heartbeat. The JobTracker then chooses the best
>> TaskTracker, taking data locality into account.
>
> Yes. To add some more, a scheduler is responsible for assigning
> tasks (based on various stats, including data locality) to the proper
> tasktrackers. Scheduler.assignTasks(TaskTracker) is used to assign a
> TaskTracker its tasks, and the scheduler type is configurable (some
> examples are the Eager/FIFO scheduler, the Capacity scheduler, etc.).
>
> This scheduling is done when a heartbeat response is about to be sent
> back to a TaskTracker that called JobTracker.heartbeat(...).
>
>>
>>
>> On Thu, Oct 28, 2010 at 2:52 PM, Zhenhua Guo <jenvor@gmail.com> wrote:
>>> Hi, all
>>>  I wonder how Hadoop schedules mappers and reducers (e.g. consider
>>> load balancing, affinity to data?). For example, how to decide on
>>> which nodes mappers and reducers are to be executed and when.
>>>  Thanks!
>>>
>>> Gerald
>>>
>>
>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
>
>
>
> --
> Harsh J
> www.harshj.com
>
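
To make the flow described in the quoted replies concrete, here is a minimal sketch of heartbeat-driven assignment with a data-locality preference. It is not Hadoop's actual code: the class HeartbeatSchedulerSketch, its submit and heartbeat methods, and the host names are all hypothetical, and the real schedulers weigh more factors (rack locality, job priority, capacity limits) than this toy does. It needs Java 9 or later for Set.of.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Toy model only: names and structure are assumptions, not Hadoop APIs.
public class HeartbeatSchedulerSketch {

    /** A pending map task and the hosts that hold a replica of its input split. */
    static final class PendingTask {
        final String id;
        final Set<String> replicaHosts;

        PendingTask(String id, Set<String> replicaHosts) {
            this.id = id;
            this.replicaHosts = replicaHosts;
        }
    }

    // Pending tasks in submission (FIFO) order.
    private final List<PendingTask> pending = new ArrayList<>();

    public void submit(String id, Set<String> replicaHosts) {
        pending.add(new PendingTask(id, replicaHosts));
    }

    /**
     * Called once per heartbeat from a tracker. Returns the ids of the tasks
     * handed back to that tracker in the heartbeat response.
     */
    public List<String> heartbeat(String trackerHost, int freeSlots) {
        List<String> assigned = new ArrayList<>();
        // Pass 1: prefer tasks whose input is already stored on this host.
        take(trackerHost, freeSlots, true, assigned);
        // Pass 2: fill any remaining slots with non-local tasks, FIFO order.
        take(trackerHost, freeSlots, false, assigned);
        return assigned;
    }

    private void take(String host, int freeSlots, boolean localOnly,
                      List<String> assigned) {
        Iterator<PendingTask> it = pending.iterator();
        while (it.hasNext() && assigned.size() < freeSlots) {
            PendingTask task = it.next();
            if (!localOnly || task.replicaHosts.contains(host)) {
                assigned.add(task.id);
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        HeartbeatSchedulerSketch scheduler = new HeartbeatSchedulerSketch();
        scheduler.submit("m0", Set.of("nodeA", "nodeB"));
        scheduler.submit("m1", Set.of("nodeC"));
        scheduler.submit("m2", Set.of("nodeB"));

        // nodeB holds the input for m0 and m2, so both slots go to local tasks.
        System.out.println(scheduler.heartbeat("nodeB", 2)); // [m0, m2]
        // nodeD holds no input, so it falls back to the remaining task.
        System.out.println(scheduler.heartbeat("nodeD", 2)); // [m1]
    }
}
```

The point of the two passes is that a tracker is never left idle just because no local task exists; locality is a preference, not a requirement, in this sketch.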
