hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-7249) Fix CapacityScheduler NPE issue when a container preempted while the node is being removed
Date Tue, 26 Sep 2017 00:40:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180007#comment-16180007
] 

Hadoop QA commented on YARN-7249:
---------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 25s{color} | {color:blue}
Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} |
{color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red}
The patch doesn't appear to include any new or modified tests. Please justify why no new tests
are needed for this patch. Also please list what manual steps were performed to verify this
patch. {color} |
|| || || || {color:brown} branch-2.8 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 51s{color}
| {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 34s{color} |
{color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 22s{color}
| {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 40s{color} |
{color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 18s{color} |
{color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 26s{color} |
{color:green} branch-2.8 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 32s{color}
| {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 31s{color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 31s{color} | {color:green}
the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 19s{color}
| {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 40s{color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color}
| {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 25s{color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 25s{color} |
{color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 91m 56s{color} | {color:red}
hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 16s{color}
| {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}109m 10s{color} | {color:black}
{color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
|   | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.TestRMHA |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:c2d96dd |
| JIRA Issue | YARN-7249 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12888909/YARN-7249.branch-2.8.001.patch
|
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs
 checkstyle  |
| uname | Linux 2db572d674f0 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017
x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2.8 / ea7e655 |
| Default Java | 1.7.0_151 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/17632/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
|
|  Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/17632/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/17632/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Fix CapacityScheduler NPE issue when a container preempted while the node is being removed
> ------------------------------------------------------------------------------------------
>
>                 Key: YARN-7249
>                 URL: https://issues.apache.org/jira/browse/YARN-7249
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.8.1
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>            Priority: Blocker
>         Attachments: YARN-7249.branch-2.8.001.patch
>
>
> This issue could happen when 3 conditions satisfied:
> 1) A node is removing from scheduler.
> 2) A container running on the node is being preempted. 
> 3) A rare race condition causes scheduler pass a null node to leaf queue.
> Fix of the problem is to add a null node check inside CapacityScheduler.
> Stack trace:
> {code}
> 2017-08-31 02:51:24,748 FATAL resourcemanager.ResourceManager (ResourceManager.java:run(714))
- Error in handling event type KILL_RESERVED_CONTAINER to the scheduler 
> java.lang.NullPointerException 
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.completedContainer(LeafQueue.java:1308)

> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.completedContainerInternal(CapacityScheduler.java:1469)

> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.completedContainer(AbstractYarnScheduler.java:497)

> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.killReservedContainer(CapacityScheduler.java:1505)

> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1341)

> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:127)

> at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:705)

> {code}
> This is an issue only existed in 2.8.x



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org


Mime
View raw message