hadoop-common-dev mailing list archives

From "Vivek Ratan (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-5199) A proposal to merge common functionality of various Schedulers
Date Wed, 11 Feb 2009 07:49:02 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vivek Ratan updated HADOOP-5199:

    Attachment: 5199.2.patch

Attaching another patch (5199.2.patch). This one additionally supports a Fairshare pool, which
includes most, if not all, of the functionality of the Fairshare scheduler. The main purpose
of this patch is to demonstrate how the base scheduler can work with a fairshare-style ordering
of jobs as well as a capacity/default-style ordering. The patch is not complete.

To support the Fairshare pool ({{FairsharePool.java}}), I had to do the following:
* The main change is that each {{FairsharePool}} object handles a single pool and orders jobs
in that pool the same way that the Fairshare scheduler does (the latter ordered jobs globally,
while {{FairsharePool}} does it only for the jobs in its pool). This follows my earlier suggestion
to let the scheduler first pick a pool and then look at the ordering of jobs in that pool. It's
easy enough for {{FairsharePool}} to go back to a global ordering by using a static map of
all jobs, rather than a map of pool-specific jobs.
* I did not include any of the configurable classes/interfaces such as {{TaskSelector}} or
{{WeightAdjuster}}. Those can be added later and are not directly relevant to this discussion.

* I got rid of the 'runnable' concept, since user limits will supposedly be handled by the
code in {{HadoopTaskScheduler}}.
* Similarly, _updateWeights()_ does not normalize weights across pools, as the suggestion
is to pick a pool based on how far behind the pool is running.
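
To make the two-level selection above concrete, here is a minimal sketch. It is not code from
the patch: {{SketchJob}}, {{SketchPool}} and {{TwoLevelPicker}} are hypothetical names, and
"how far behind" is approximated by a simple deficit (fair share minus running tasks), which
is an assumption made for illustration only.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical stand-in for a job; not a Hadoop class. */
final class SketchJob {
    final String name;
    final double fairShare;  // slots the job is entitled to (assumed input)
    final int runningTasks;  // slots the job currently holds

    SketchJob(String name, double fairShare, int runningTasks) {
        this.name = name;
        this.fairShare = fairShare;
        this.runningTasks = runningTasks;
    }

    /** A positive value means the job is running behind its share. */
    double deficit() {
        return fairShare - runningTasks;
    }
}

/** Hypothetical pool that orders only its own jobs. */
final class SketchPool {
    final String name;
    // Per-pool job list. Sharing one static collection across all pools
    // would give back a global ordering instead.
    private final List<SketchJob> jobs = new ArrayList<>();

    SketchPool(String name) {
        this.name = name;
    }

    void addJob(SketchJob job) {
        jobs.add(job);
    }

    boolean isEmpty() {
        return jobs.isEmpty();
    }

    /** How far behind the pool is: the sum of its jobs' deficits. */
    double deficit() {
        return jobs.stream().mapToDouble(SketchJob::deficit).sum();
    }

    /** The job in this pool that is furthest behind, if any. */
    Optional<SketchJob> nextJob() {
        return jobs.stream().max(Comparator.comparingDouble(SketchJob::deficit));
    }
}

/** Picks a pool first, then a job within that pool. */
final class TwoLevelPicker {
    static Optional<SketchJob> pick(List<SketchPool> pools) {
        return pools.stream()
                .filter(p -> !p.isEmpty())
                .max(Comparator.comparingDouble(SketchPool::deficit))
                .flatMap(SketchPool::nextJob);
    }
}
```

Note that this can differ from a global ordering: a pool with two moderately starved jobs can
be picked ahead of a pool holding the single most starved job, because the pool is chosen
before any individual job is considered.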

Again, this patch is just to demonstrate how a Fairshare pool can fit into the core scheduler
without losing much, if any, functionality. This patch hopefully also shows how the different
data structures and information are spread out across the classes.

> A proposal to merge common functionality of various Schedulers
> --------------------------------------------------------------
>                 Key: HADOOP-5199
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5199
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Vivek Ratan
>         Attachments: 5199.1.patch, 5199.2.patch
> There are at least 3 Schedulers in Hadoop today: Default, Capacity, and Fairshare. Over
> time, we're seeing a lot of functionality common to all three. Many bug fixes, improvements
> to existing functionality, and new functionality are applicable to all three schedulers. This
> trend seems to be getting stronger, as we notice similar problems, solutions, and ideas. This
> is a proposal to detect and consolidate such common functionality.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
