Mailing-List: contact yarn-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: yarn-issues@hadoop.apache.org
Date: Mon, 10 Sep 2012 17:09:10 +1100 (NCT)
From: "Karthik Kambatla (JIRA)" <jira@apache.org>
To: yarn-issues@hadoop.apache.org
Message-ID: <684534808.56611.1347257350401.JavaMail.jiratomcat@arcas>
Subject: [jira] [Commented] (YARN-80) Support delay scheduling for node
 locality in MR2's capacity scheduler
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/YARN-80?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451777#comment-13451777 ] 

Karthik Kambatla commented on YARN-80:
--------------------------------------

bq. Perhaps the better way to do this is to have the AM be responsible for making the requests at different times. So, for example, on the first heartbeat after a container is needed, only the node-local request is made. If it is not satisfied within a specific timeout (1 heartbeat by default), a rack-local request is added, and finally the global request is added after another timeout.

+1. Should we create a separate JIRA for this so that it doesn't get lost?
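
To make the quoted proposal concrete, here is a rough sketch of the AM-side relaxation logic. This is not code from the YARN-80 patch or from any existing AM; the class and method names are hypothetical, and the use of "*" for the off-rack (global) request is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PendingContainer:
    """Hypothetical AM-side bookkeeping for one container still needed."""
    node: str                   # preferred host
    rack: str                   # rack of the preferred host
    heartbeats_waited: int = 0  # heartbeats elapsed without an allocation

    def requests_for_heartbeat(self, timeout: int = 1) -> List[str]:
        """Return the resource names to ask for on this heartbeat.

        Only the node-local name is requested at first; one more
        locality level is unlocked after every `timeout` heartbeats.
        """
        if timeout < 1:
            raise ValueError("timeout must be at least 1 heartbeat")
        # Strictest to most relaxed; "*" stands for "any host" (assumed).
        names = [self.node, self.rack, "*"]
        unlocked = min(self.heartbeats_waited // timeout, len(names) - 1)
        return names[: unlocked + 1]

    def on_heartbeat_without_allocation(self) -> None:
        """Record that a heartbeat passed and the container is still pending."""
        self.heartbeats_waited += 1
```

With the default timeout of 1 heartbeat, a pending container would ask for its node only, then node plus rack, then node plus rack plus "*", which matches the ordering described in the quote. A real AM would also need to retract the relaxed requests once the container is allocated.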
                
> Support delay scheduling for node locality in MR2's capacity scheduler
> ----------------------------------------------------------------------
>
>                 Key: YARN-80
>                 URL: https://issues.apache.org/jira/browse/YARN-80
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacityscheduler
>            Reporter: Todd Lipcon
>            Assignee: Arun C Murthy
>             Fix For: 2.0.2-alpha
>
>         Attachments: YARN-80.patch, YARN-80.patch
>
>
> The capacity scheduler in MR2 doesn't support delay scheduling for achieving node-level locality. So, jobs exhibit poor data locality even if they have good rack locality. Especially on clusters where disk throughput is much better than network capacity, this hurts overall job performance. We should optionally support node-level delay scheduling heuristics similar to what the fair scheduler implements in MR1.
