hadoop-mapreduce-issues mailing list archives

From "Patrick Wendell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3451) Port Fair Scheduler to MR2
Date Sat, 19 May 2012 07:04:10 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279465#comment-13279465 ]

Patrick Wendell commented on MAPREDUCE-3451:

I'm not sure I understand what it would mean to have a single featureful scheduler.

There is already an abstraction layer built into YARN to allow multiple scheduling policies.
Aside from the logic specific to the policy itself, the code for dealing with resource management
is re-used amongst schedulers. That is, when someone is using the Fair Scheduler (vs the Capacity
or FIFO scheduler) they are using mostly the same RM code save for the scheduling logic itself.
Furthermore, several shared classes are used by both the Fair and Capacity schedulers, such
as the Queue interface and the SchedulerApp class - these common classes are "tested" by users
of both schedulers.

The alternative suggested seems to be having a monolithic scheduler class that can enact different
policies depending on some configuration option. Any sensible implementation of that approach
would abstract out the scheduling policy and I think you'd get what's already there now.
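
A minimal sketch of that separation, in Java, may make the point concrete. All names here
(SchedulingPolicy, FifoPolicy, FairPolicy, ResourceManagerCore, App) are hypothetical and
are not the actual YARN classes, and the "fair" rule is a crude stand-in (fewest running
containers first), not the real Fair Scheduler algorithm. It only shows shared code
delegating one decision to a pluggable policy.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SchedulerSketch {

    /** Hypothetical stand-in for a running application; not YARN's SchedulerApp. */
    static final class App {
        final String name;
        final long submitTime;
        final int runningContainers;

        App(String name, long submitTime, int runningContainers) {
            this.name = name;
            this.submitTime = submitTime;
            this.runningContainers = runningContainers;
        }
    }

    /** The only piece that differs between schedulers: who gets the next container. */
    interface SchedulingPolicy {
        App pickNext(List<App> apps);
    }

    /** Earliest-submitted application first. */
    static final class FifoPolicy implements SchedulingPolicy {
        @Override
        public App pickNext(List<App> apps) {
            return apps.stream()
                    .min(Comparator.comparingLong((App a) -> a.submitTime))
                    .orElse(null);
        }
    }

    /** Crude fairness stand-in (assumption): fewest running containers first. */
    static final class FairPolicy implements SchedulingPolicy {
        @Override
        public App pickNext(List<App> apps) {
            return apps.stream()
                    .min(Comparator.comparingInt((App a) -> a.runningContainers))
                    .orElse(null);
        }
    }

    /** Shared code, identical no matter which policy is plugged in. */
    static final class ResourceManagerCore {
        private final SchedulingPolicy policy;

        ResourceManagerCore(SchedulingPolicy policy) {
            this.policy = policy;
        }

        /** Returns the name of the app that receives the next free container, or null. */
        String assignContainer(List<App> apps) {
            App chosen = policy.pickNext(apps);
            return chosen == null ? null : chosen.name;
        }
    }

    public static void main(String[] args) {
        List<App> apps = Arrays.asList(
                new App("etl", 1L, 10),
                new App("adhoc", 2L, 0));
        System.out.println(new ResourceManagerCore(new FifoPolicy()).assignContainer(apps));
        System.out.println(new ResourceManagerCore(new FairPolicy()).assignContainer(apps));
    }
}
```

A "monolithic" scheduler that switches policy on a configuration value would, once cleaned
up, end up with the same shape: the switch just picks which SchedulingPolicy to construct.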

Or maybe the idea is that everyone using Hadoop will want the same scheduling policy (modulo
minor configurations like timeouts, capacities, etc.). That seems unlikely to me given the
diversity of Hadoop deployments and the fact that both of the currently available schedulers
in MR1 are widely used.

Also, just a note - adding preemption to the capacity scheduler will not make it equivalent to
the Fair Scheduler. These are fundamentally different approaches which will become substantially
more different when multiple resource types are added to the YARN RM stack.

I strongly agree that having single implementations of common classes is a win in terms of
support and testability - but I think this implementation mostly achieves that goal.
> Port Fair Scheduler to MR2
> --------------------------
>                 Key: MAPREDUCE-3451
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3451
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, scheduler
>            Reporter: Patrick Wendell
>            Assignee: Patrick Wendell
>         Attachments: MAPREDUCE-3451.v1.patch.txt, MAPREDUCE-3451.v2.patch.txt, MAPREDUCE-3451.v3.patch.txt,
> MAPREDUCE-3451.v4.patch.txt, MAPREDUCE-3451.v5.patch
>
> The Fair Scheduler is in widespread use today in MR1 clusters, but not yet ported to
> MR2. This is to track the porting of the Fair Scheduler to MR2 and will be updated to include
> design considerations and progress.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

