hadoop-mapreduce-issues mailing list archives

From "Tom White (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3451) Port Fair Scheduler to MR2
Date Fri, 06 Apr 2012 20:17:24 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13248692#comment-13248692 ]

Tom White commented on MAPREDUCE-3451:
--------------------------------------

Overall this looks like a great addition. Initial review feedback:

* It looks like what were called "pools" in the MR1 version are now "queues", to fit with
the term that the capacity scheduler uses. Is that correct? If so, then the fair scheduler
code should use the same term throughout.
* What's the locking order for the fair scheduler classes and the classes that call them?
It might be worth documenting it somewhere in the code.
* I noticed that not all the unit tests from MR1 have been ported. Are you planning on including
the rest? 
* Configuration strings are duplicated and spread through different classes. How about creating
a FairSchedulerConfiguration class to hold all the constants (like capacity scheduler does)?
* Seeing that the configuration is changing compared to MR1, is it worth changing the allocations
file to be in Hadoop configuration format? Having one less format to deal with would be an
improvement IMO.
* Annotate all classes as @Private @Unstable. Some are not annotated at all, even though they
should be considered private.
* PoolSchedulable.assignContainer has a log line which is called frequently ("Node offered
to pool"), so should really be debug level.
* The Scheduler UI link gives a 500 error - I know you plan on fixing this, but I would have
expected a 404.
* Nit: PoolManager has two license headers
* Nit: Resources class spelling: "Mutliply"
* Nit: SchedulerApp - schedulingOpportunities comment has been changed from javadoc comment
in MR1 - change back?
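
To illustrate the FairSchedulerConfiguration suggestion above, here is a minimal sketch of a class that keeps the configuration keys and their defaults in one place. It is not taken from the patch: the class name FairSchedulerConfigurationSketch, the PREFIX constant, the accessor name, and the default value are all assumptions made for illustration, and java.util.Properties stands in for Hadoop's own configuration class so that the example is self-contained. The only string taken from this review is the key yarn.scheduler.fair.user-as-default-queue.

```java
import java.util.Properties;

/**
 * Illustrative sketch only: gathers fair scheduler configuration keys and
 * defaults in one class so that other classes do not repeat string literals.
 * All names here are hypothetical except the user-as-default-queue key.
 */
final class FairSchedulerConfigurationSketch {
    // Shared prefix, so each key is spelled out exactly once.
    static final String PREFIX = "yarn.scheduler.fair.";

    // Key mentioned in the review as new relative to the MR1 fair scheduler.
    static final String USER_AS_DEFAULT_QUEUE = PREFIX + "user-as-default-queue";

    // Assumed default, chosen for illustration; not taken from the patch.
    static final boolean DEFAULT_USER_AS_DEFAULT_QUEUE = true;

    // Stand-in for the real configuration object (an assumption of this sketch).
    private final Properties props;

    FairSchedulerConfigurationSketch(Properties props) {
        this.props = props;
    }

    /** Returns the configured value, or the default when the key is unset. */
    boolean getUserAsDefaultQueue() {
        String raw = props.getProperty(USER_AS_DEFAULT_QUEUE);
        return raw == null
                ? DEFAULT_USER_AS_DEFAULT_QUEUE
                : Boolean.parseBoolean(raw.trim());
    }
}
```

With this shape, scheduler classes would call an accessor such as getUserAsDefaultQueue() instead of repeating the key string, and a renamed key would only need to change in one file.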

What testing have you done? I tried it on a pseudo-distributed cluster and successfully ran
a job with the fair scheduler.

This will need some documentation (in hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/)
- are you planning on doing that as a separate JIRA? It would also be useful to call out the
configuration differences compared to the MR1 fair scheduler. (E.g. I noticed that yarn.scheduler.fair.user-as-default-queue
is new.)

                
> Port Fair Scheduler to MR2
> --------------------------
>
>                 Key: MAPREDUCE-3451
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3451
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, scheduler
>            Reporter: Patrick Wendell
>            Assignee: Patrick Wendell
>         Attachments: MAPREDUCE-3451.v1.patch.txt, MAPREDUCE-3451.v2.patch.txt
>
>
> The Fair Scheduler is in widespread use today in MR1 clusters, but not yet ported to
> MR2. This is to track the porting of the Fair Scheduler to MR2 and will be updated to include
> design considerations and progress.


        
