hadoop-yarn-issues mailing list archives

From "zhihai xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4440) FSAppAttempt#getAllowedLocalityLevelByTime should init the lastScheduler time
Date Mon, 14 Dec 2015 22:57:46 GMT

    [ https://issues.apache.org/jira/browse/YARN-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15056895#comment-15056895 ]

zhihai xu commented on YARN-4440:
---------------------------------

Good catch! Thanks for working on this issue, [~linyiqun]!
+1 for the latest patch. The test failures are not related to the patch; they were
already reported in YARN-4318 and YARN-4306.
I will commit it tomorrow if no one objects.


> FSAppAttempt#getAllowedLocalityLevelByTime should init the lastScheduler time
> -----------------------------------------------------------------------------
>
>                 Key: YARN-4440
>                 URL: https://issues.apache.org/jira/browse/YARN-4440
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.7.1
>            Reporter: Lin Yiqun
>            Assignee: Lin Yiqun
>         Attachments: YARN-4440.001.patch, YARN-4440.002.patch, YARN-4440.003.patch
>
>
> There seems to be a bug in the {{FSAppAttempt#getAllowedLocalityLevelByTime}} method:
> {code}
> // default level is NODE_LOCAL
>     if (! allowedLocalityLevel.containsKey(priority)) {
>       allowedLocalityLevel.put(priority, NodeType.NODE_LOCAL);
>       return NodeType.NODE_LOCAL;
>     }
> {code}
> The first invocation of this method does not initialize the time in lastScheduledContainer,
> so the next invocation executes the following code:
> {code}
>     // check waiting time
>     long waitTime = currentTimeMs;
>     if (lastScheduledContainer.containsKey(priority)) {
>       waitTime -= lastScheduledContainer.get(priority);
>     } else {
>       waitTime -= getStartTime();
>     }
> {code}
> Because lastScheduledContainer has no entry for the priority, waitTime is computed against
> the FsApp start time. The FsApp start time is earlier than currentTimeMs, so waitTime can
> easily exceed the delay time, and allowedLocalityLevel degrades. We should therefore record
> the initial time for the priority, so that waitTime is not compared against the FsApp start
> time and allowedLocalityLevel does not degrade. This problem has a larger negative impact
> on small jobs. YARN-4399 also discusses some locality-related problems.
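
The behavior described in the issue can be reproduced with a small standalone sketch. This is not the Hadoop source or the attached patch: the class name LocalityDelaySketch, the initTimeOnFirstCall flag, the Integer priority key, and the timing values are all assumptions made for illustration, and the degrade step is simplified. With the flag off, the second call measures the wait from the application start time and relaxes locality at once; with the flag on, the wait is measured from the first call.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical model of the locality-delay check discussed in YARN-4440.
class LocalityDelaySketch {
  enum NodeType { NODE_LOCAL, RACK_LOCAL, OFF_SWITCH }

  private final Map<Integer, NodeType> allowedLocalityLevel = new HashMap<>();
  private final Map<Integer, Long> lastScheduledContainer = new HashMap<>();
  private final long startTime;               // stands in for getStartTime()
  private final boolean initTimeOnFirstCall;  // hypothetical switch: true = proposed fix

  LocalityDelaySketch(long startTime, boolean initTimeOnFirstCall) {
    this.startTime = startTime;
    this.initTimeOnFirstCall = initTimeOnFirstCall;
  }

  NodeType getAllowedLocalityLevelByTime(int priority, long nodeDelayMs,
                                         long rackDelayMs, long currentTimeMs) {
    // default level is NODE_LOCAL
    if (!allowedLocalityLevel.containsKey(priority)) {
      if (initTimeOnFirstCall) {
        // proposed fix: remember when this priority was first seen
        lastScheduledContainer.put(priority, currentTimeMs);
      }
      allowedLocalityLevel.put(priority, NodeType.NODE_LOCAL);
      return NodeType.NODE_LOCAL;
    }

    NodeType allowed = allowedLocalityLevel.get(priority);
    if (allowed == NodeType.OFF_SWITCH) {
      return allowed; // nothing left to relax
    }

    // check waiting time
    long waitTime = currentTimeMs;
    if (lastScheduledContainer.containsKey(priority)) {
      waitTime -= lastScheduledContainer.get(priority);
    } else {
      waitTime -= startTime; // falls back to the application start time
    }

    long threshold = (allowed == NodeType.NODE_LOCAL) ? nodeDelayMs : rackDelayMs;
    if (waitTime > threshold) {
      NodeType relaxed = (allowed == NodeType.NODE_LOCAL)
          ? NodeType.RACK_LOCAL : NodeType.OFF_SWITCH;
      allowedLocalityLevel.put(priority, relaxed);
      lastScheduledContainer.put(priority, currentTimeMs); // restart the wait clock
      return relaxed;
    }
    return allowed;
  }

  public static void main(String[] args) {
    long appStart = 0L, firstCall = 60_000L, secondCall = 60_100L, delay = 5_000L;

    LocalityDelaySketch buggy = new LocalityDelaySketch(appStart, false);
    buggy.getAllowedLocalityLevelByTime(1, delay, delay, firstCall);
    // prints "without init: RACK_LOCAL" (60100 ms since app start > 5000 ms)
    System.out.println("without init: "
        + buggy.getAllowedLocalityLevelByTime(1, delay, delay, secondCall));

    LocalityDelaySketch fixed = new LocalityDelaySketch(appStart, true);
    fixed.getAllowedLocalityLevelByTime(1, delay, delay, firstCall);
    // prints "with init: NODE_LOCAL" (only 100 ms since the first call)
    System.out.println("with init: "
        + fixed.getAllowedLocalityLevelByTime(1, delay, delay, secondCall));
  }
}
```

In this sketch the application has been running for 60 seconds before the first scheduling call, which is why the unfixed variant degrades to RACK_LOCAL only 100 ms later.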



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
