hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9950) TestDecommissioningStatus fails intermittently in trunk
Date Sun, 13 Mar 2016 07:45:36 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192184#comment-15192184
] 

Hadoop QA commented on HDFS-9950:
---------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s {color} | {color:blue}
Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green}
The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color}
| {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 42s {color}
| {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s {color} |
{color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s {color} |
{color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s {color}
| {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s {color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color}
| {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 55s {color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 14s {color} |
{color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 53s {color} |
{color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 47s {color}
| {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s {color} |
{color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 46s {color} | {color:green}
the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s {color} |
{color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s {color} | {color:green}
the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s {color}
| {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s {color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color}
| {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color}
| {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 5s {color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s {color} | {color:green}
the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 48s {color} |
{color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 10s {color} | {color:red}
hadoop-hdfs in the patch failed with JDK v1.8.0_74. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 57m 39s {color} | {color:red}
hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s {color}
| {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 146m 25s {color} | {color:black}
{color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_74 Failed junit tests | hadoop.hdfs.server.namenode.TestEditLog |
|   | hadoop.hdfs.TestFileAppend |
|   | hadoop.hdfs.TestDFSUpgradeFromImage |
| JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.server.namenode.ha.TestBootstrapStandbyWithQJM
|
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12793182/HDFS-9950.001.patch
|
| JIRA Issue | HDFS-9950 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs
 checkstyle  |
| uname | Linux b4fa6036c547 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12
UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 658ee95 |
| Default Java | 1.7.0_95 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_74 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
|
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/14805/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_74.txt
|
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/14805/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
|
| unit test logs |  https://builds.apache.org/job/PreCommit-HDFS-Build/14805/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_74.txt
https://builds.apache.org/job/PreCommit-HDFS-Build/14805/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
|
| JDK v1.7.0_95  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/14805/testReport/
|
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/14805/console |
| Powered by | Apache Yetus 0.2.0   http://yetus.apache.org |


This message was automatically generated.



> TestDecommissioningStatus fails intermittently in trunk
> -------------------------------------------------------
>
>                 Key: HDFS-9950
>                 URL: https://issues.apache.org/jira/browse/HDFS-9950
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: test
>            Reporter: Lin Yiqun
>            Assignee: Lin Yiqun
>         Attachments: HDFS-9950.001.patch
>
>
> I often found that the testcase {{TestDecommissioningStatus}} failed sometimes. And I
looked the test failed report, it always show these error infos:
> {code}
> testDecommissionStatus(org.apache.hadoop.hdfs.server.namenode.TestDecommissioningStatus)
 Time elapsed: 0.462 sec  <<< FAILURE!
> java.lang.AssertionError: Unexpected num under-replicated blocks expected:<3> but
was:<4>
> 	at org.junit.Assert.fail(Assert.java:88)
> 	at org.junit.Assert.failNotEquals(Assert.java:743)
> 	at org.junit.Assert.assertEquals(Assert.java:118)
> 	at org.junit.Assert.assertEquals(Assert.java:555)
> 	at org.apache.hadoop.hdfs.server.namenode.TestDecommissioningStatus.checkDecommissionStatus(TestDecommissioningStatus.java:196)
> 	at org.apache.hadoop.hdfs.server.namenode.TestDecommissioningStatus.testDecommissionStatus(TestDecommissioningStatus.java:291)
> {code}
> And I know the reason is that the under-replicated num is not correct in method checkDecommissionStatus
of {{TestDecommissioningStatus#testDecommissionStatus}}. 
> In this testcase, each datanode should have 4 blocks(2 for decommission.dat, 2 for decommission.dat1)The
expect num 3 on first node is because the lastBlock of  uc blockCollection can not be replicated
if its numlive just more than blockManager minReplication(in this case is 1). And before decommed
second datanode, it has already one live replication for the uc blockCollection' lastBlock
in this node. 
> So in this failed case, the first node's under-replicat changes to 4 indicated that the
uc blockCollection lastBlock's livenum is already 0 before the second datanode decommed. So
I think there are two possibilitys will lead to it. 
> * The second datanode was already decommed before node one.
> * Creating file decommission.dat1 failed that lead that the second datanode has no this
block.
> And I read the code, it has checked the decommission-in-progress nodes here
> {code}
> if (iteration == 0) {
>         assertEquals(decommissioningNodes.size(), 1);
>         DatanodeDescriptor decommNode = decommissioningNodes.get(0);
>         checkDecommissionStatus(decommNode, 3, 0, 1);
>         checkDFSAdminDecommissionStatus(decommissioningNodes.subList(0, 1),
>             fileSys, admin);
>       }
> {code}
> So it seems the second possibility are more likely the reason. And in addition, it hasn't
did a block number check when finished the creating file. So we could do a check and retry
operatons here if block number is not correct as expected.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message