hadoop-mapreduce-user mailing list archives

From Rahul Bhattacharjee <rahul.rec....@gmail.com>
Subject Re: Hadoop schedulers!
Date Tue, 14 May 2013 02:14:12 GMT
Thanks a lot for the replies, they were really helpful.


On Tue, May 14, 2013 at 1:02 AM, Alok Kumar <alokawi@gmail.com> wrote:

> Hi,
>
> As the name suggests, the Fair Scheduler allocates slots fairly among
> jobs.
> Let's say you have 10 map slots in your cluster, all occupied by job-1,
> which requires 30 map slots to finish. At the same time, another job,
> job-2, requires only 2 map slots to finish. Here slots will be given to
> job-2 so that it finishes quickly, while job-1 keeps running.
>
>
>
> On Tue, May 14, 2013 at 12:02 AM, Rahul Bhattacharjee <
> rahul.rec.dgp@gmail.com> wrote:
>
>> Any pointers on my question?
>>
>> There is another question, kind of dumb, but I just wanted to clarify.
>>
>> Say in a FIFO scheduler or a capacity scheduler, if there are slots
>> available and the first job doesn't need all of them, is the next job
>> in the queue scheduled for execution, or does it still wait for the
>> first job to finish?
>>
>
> - Jobs don't wait for all the slots to be freed. A job starts executing as
> soon as it gets a slot. However, Hadoop does its best to allot a slot where
> the job can achieve data locality.
>
>
>
>> Thanks,
>> Rahul
>>
>>
>> On Sat, May 11, 2013 at 8:31 PM, Rahul Bhattacharjee <
>> rahul.rec.dgp@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I was going through the job schedulers of Hadoop and could not see any
>>> major operational difference between the capacity scheduler and the fair
>>> share scheduler, apart from the fact that the fair share scheduler
>>> supports preemption and the capacity scheduler doesn't.
>>>
>>> Another thing is that the fair share scheduler creates logical pools
>>> based on certain attributes like username, user group, etc., and the
>>> capacity scheduler has a notion of job queues. Can someone point me to
>>> any other major differences between these two types of schedulers?
>>>
>>> Another question in this regard: the capacity scheduler uses a FIFO
>>> queue. So it's still possible for a high-priority, long-running job that
>>> uses all the capacity allocated to the queue to block all the other jobs
>>> after it in the queue. I think this is the expected behavior, but I
>>> wanted to confirm.
>>>
>>> Thanks,
>>> Rahul
>>>
>>>
>>>
>>
>
> Thanks
> --
> Alok
>
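
To make the two answers in this thread concrete, here is a minimal Python sketch of slot allocation under a FIFO policy and under a fair-share policy. It is a toy model, not Hadoop's scheduler code: the function names `fifo_allocate` and `fair_allocate` are made up for illustration, and it assumes a single pool with equal weights while ignoring data locality, minimum shares, priorities, and preemption.

```python
def fifo_allocate(total_slots, demands):
    """Grant slots in submission order; leftover slots flow to later jobs.

    demands is a list of (job_name, slots_wanted) in submission order.
    """
    allocation = {}
    free = total_slots
    for job, demand in demands:
        granted = min(demand, free)
        allocation[job] = granted
        free -= granted
    return allocation


def fair_allocate(total_slots, demands):
    """Split slots evenly among jobs that still want more (max-min style)."""
    allocation = {job: 0 for job, _ in demands}
    remaining = dict(demands)  # slots each job still wants
    free = total_slots
    while free > 0 and remaining:
        # Even share of what is left, at least one slot per round.
        share = max(free // len(remaining), 1)
        for job in list(remaining):
            if free == 0:
                break
            granted = min(share, remaining[job], free)
            allocation[job] += granted
            remaining[job] -= granted
            free -= granted
            if remaining[job] == 0:
                del remaining[job]  # satisfied; its unused share is redistributed
    return allocation


if __name__ == "__main__":
    jobs = [("job-1", 30), ("job-2", 2)]
    print("FIFO:", fifo_allocate(10, jobs))
    print("Fair:", fair_allocate(10, jobs))
    # A first job that does not need every slot leaves room for the next one.
    print("FIFO, small first job:", fifo_allocate(10, [("job-1", 4), ("job-2", 30)]))
```

With the 10-slot example from Alok's reply, the FIFO model gives every slot to job-1, while the fair-share model gives job-2 its 2 slots and leaves 8 for job-1. The last call shows the point about FIFO: when the first job needs only 4 slots, the next job picks up the remaining 6 instead of waiting. In a real cluster without preemption, slots already held by running tasks are only reassigned as those tasks finish.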
