hadoop-yarn-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-201) CapacityScheduler can take a very long time to schedule containers if requests are off cluster
Date Thu, 08 Nov 2012 12:43:12 GMT

    [ https://issues.apache.org/jira/browse/YARN-201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493144#comment-13493144 ]

Hudson commented on YARN-201:
-----------------------------

Integrated in Hadoop-Hdfs-0.23-Build #429 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/429/])
    Merge -c 1406834 from trunk to branch-2 to fix YARN-201. Fix CapacityScheduler to be less conservative for starved off-switch requests. Contributed by Jason Lowe. (Revision 1406836)

     Result = SUCCESS
acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1406836
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java

                
> CapacityScheduler can take a very long time to schedule containers if requests are off cluster
> ----------------------------------------------------------------------------------------------
>
>                 Key: YARN-201
>                 URL: https://issues.apache.org/jira/browse/YARN-201
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 0.23.3, 2.0.1-alpha
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Critical
>             Fix For: 2.0.3-alpha, 0.23.5
>
>         Attachments: YARN-201.patch, YARN-201.patch
>
>
> When a user runs a job where one of the input files is a large file on another cluster,
> the job can create many splits on nodes which are unreachable for computation from the current
> cluster.  The off-switch delay logic in LeafQueue can cause the ResourceManager to allocate
> containers for the job very slowly.  In one case the job was only getting one container every
> 23 seconds, and the queue had plenty of spare capacity.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
