hadoop-common-user mailing list archives

From Yaron Gonen <yaron.go...@gmail.com>
Subject Re: Selecting a task for the tasktracker
Date Thu, 27 Dec 2012 21:18:41 GMT
Thanks a lot!

On Thu, Dec 27, 2012 at 8:11 PM, Vinod Kumar Vavilapalli <
vinodkv@hortonworks.com> wrote:

> On top of that, the "not visible" message indicates that your scheduler
> class needs to be in the same package as TaskScheduler, i.e. the mapred
> package.
> Thanks,
> +Vinod Kumar Vavilapalli
> Hortonworks Inc.
> http://hortonworks.com/
> On Dec 27, 2012, at 7:38 AM, Hemanth Yamijala wrote:
> Hi,
> Firstly, I am talking about Hadoop 1.0. Please note that in Hadoop 2.x and
> trunk, the MapReduce framework has been completely reworked on top of YARN (
> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html)
> and you may need to look at different interfaces for building your own
> scheduler.
> In 1.0, the primary function of the TaskScheduler is the assignTasks
> method. Given a TaskTracker object as input, this method figures out how
> many free map and reduce slots exist in that particular tasktracker and
> selects one or more tasks that can be scheduled on it. Since task selection
> is the primary responsibility and the granularity is at a task level, the
> class is called TaskScheduler.
> The way a job is chosen, and then a task within that job, is customised
> by the different schedulers already present in Hadoop. Also, the core logic
> of selecting a map task with data locality optimizations is not implemented
> in the schedulers per se; they rely on the JobInProgress object in the
> MapReduce framework for that.
> To implement your own scheduler, it may be best to look at the sources of
> the existing schedulers: JobQueueTaskScheduler, CapacityTaskScheduler or
> FairScheduler. In particular, the last two are in the contrib modules of
> MapReduce, and hence are fairly self-contained and easy to follow. Their build
> files will also tell you how to resolve any compile problems like the one
> you are facing.
> Thanks
> Hemanth
> On Thu, Dec 27, 2012 at 4:10 PM, Yaron Gonen <yaron.gonen@gmail.com> wrote:
>> Hi,
>> If I understand correctly, the job scheduler (why is the class called
>> TaskScheduler?) is responsible for assigning the task whose split is as
>> close as possible to the tasktracker.
>> Meaning that the job scheduler is responsible for two things:
>>    1. Selecting a job.
>>    2. Once a job is selected, assigning the closest task to the tasktracker
>>    that sent the heartbeat.
>> Is this correct?
>> I want to write my own job scheduler to change the logic above, but I
>> get the error "The type TaskScheduler is not visible".
>> How can I write my own scheduler?
>> thanks
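
The division of labour described in the thread (the scheduler picks a job order and fills free slots, while the job object picks a data-local task) can be illustrated with a small self-contained sketch. Everything below is a hypothetical stand-in: Tracker, Task, Job and this assignTasks are toy types invented for illustration, not the Hadoop 1.0 classes, and the locality rule is a simplification of whatever JobInProgress actually does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ToySchedulerSketch {

    /** Hypothetical stand-in for a heartbeating tasktracker: host name plus free map slots. */
    static final class Tracker {
        final String host;
        final int freeMapSlots;
        Tracker(String host, int freeMapSlots) {
            this.host = host;
            this.freeMapSlots = freeMapSlots;
        }
    }

    /** Hypothetical map task whose input split lives on a single host. */
    static final class Task {
        final String id;
        final String splitHost;
        Task(String id, String splitHost) {
            this.id = id;
            this.splitHost = splitHost;
        }
    }

    /** Hypothetical job; plays the role the thread attributes to JobInProgress. */
    static final class Job {
        final String name;
        final List<Task> pending;
        Job(String name, Task... tasks) {
            this.name = name;
            this.pending = new ArrayList<>(Arrays.asList(tasks));
        }

        /** Returns a pending task, preferring one whose split is on the tracker's host; null if none remain. */
        Task obtainMapTask(Tracker tracker) {
            for (Iterator<Task> it = pending.iterator(); it.hasNext();) {
                Task t = it.next();
                if (t.splitHost.equals(tracker.host)) {
                    it.remove();
                    return t;
                }
            }
            // No data-local task: fall back to the oldest pending one, if any.
            return pending.isEmpty() ? null : pending.remove(0);
        }
    }

    /** Toy scheduler policy: walk jobs in the given order and fill the tracker's free map slots. */
    static List<Task> assignTasks(Tracker tracker, List<Job> jobsInOrder) {
        List<Task> assigned = new ArrayList<>();
        for (Job job : jobsInOrder) {
            while (assigned.size() < tracker.freeMapSlots) {
                Task t = job.obtainMapTask(tracker);
                if (t == null) {
                    break; // this job has nothing left; try the next job
                }
                assigned.add(t);
            }
        }
        return assigned;
    }

    public static void main(String[] args) {
        Job first = new Job("job-1", new Task("j1-m0", "hostB"), new Task("j1-m1", "hostA"));
        Job second = new Job("job-2", new Task("j2-m0", "hostA"));
        List<Task> picked = assignTasks(new Tracker("hostA", 2), Arrays.asList(first, second));
        for (Task t : picked) {
            System.out.println(t.id + " (split on " + t.splitHost + ")");
        }
    }
}
```

In this toy model a custom policy would only change the order of jobsInOrder or the loop in assignTasks; the locality preference stays inside Job, which mirrors the split of responsibilities Hemanth describes. A real scheduler would instead extend TaskScheduler and, as Vinod notes, be declared in the mapred package.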
