hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9619) DataNode sometimes can not find blockpool for the correct namenode
Date Wed, 06 Jan 2016 20:29:40 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086225#comment-15086225
] 

Hadoop QA commented on HDFS-9619:
---------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue}
Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green}
The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color}
| {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 33s {color}
| {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s {color} |
{color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s {color} |
{color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s {color}
| {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s {color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color}
| {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 53s {color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s {color} | {color:green}
trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s {color} |
{color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 46s {color}
| {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s {color} |
{color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s {color} | {color:green}
the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s {color} |
{color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s {color} | {color:green}
the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s {color}
| {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s {color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color}
| {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color}
| {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 2s {color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s {color} | {color:green}
the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 43s {color} |
{color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 53m 32s {color} | {color:red}
hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 50m 30s {color} | {color:green}
hadoop-hdfs in the patch passed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s {color}
| {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 130m 1s {color} | {color:black}
{color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | hadoop.hdfs.server.datanode.TestBlockReplacement |
|   | hadoop.hdfs.TestBlockStoragePolicy |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12780801/HDFS-9619.001.patch
|
| JIRA Issue | HDFS-9619 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs
 checkstyle  |
| uname | Linux 30744bddc127 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12
UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b993668 |
| Default Java | 1.7.0_91 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_66 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_91
|
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/14045/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_66.txt
|
| unit test logs |  https://builds.apache.org/job/PreCommit-HDFS-Build/14045/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_66.txt
|
| JDK v1.7.0_91  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/14045/testReport/
|
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Max memory used | 75MB |
| Powered by | Apache Yetus 0.2.0-SNAPSHOT   http://yetus.apache.org |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/14045/console |


This message was automatically generated.



> DataNode sometimes can not find blockpool for the correct namenode
> ------------------------------------------------------------------
>
>                 Key: HDFS-9619
>                 URL: https://issues.apache.org/jira/browse/HDFS-9619
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, test
>    Affects Versions: 3.0.0
>         Environment: Jenkins
>            Reporter: Wei-Chiu Chuang
>            Assignee: Wei-Chiu Chuang
>              Labels: test
>         Attachments: HDFS-9619.001.patch, HDFS-9619.002.patch
>
>
> We sometimes see {{TestBalancerWithMultipleNameNodes.testBalancer}} failed to replicate
a file, because a data node is excluded.
> {noformat}
> File /tmp.txt could only be replicated to 0 nodes instead of minReplication (=1).  There
are 1 datanode(s) running and 1 node(s) are excluded in this operation.
>  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1745)
>  at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:299)
>  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2390)
>  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:797)
>  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:500)
>  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976)
>  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2305)
>  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2301)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:415)
>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1705)
>  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2299)
> {noformat}
> Relevent logs suggest root cause is due to block pool not found.  
> {noformat}
> 2016-01-03 22:11:43,174 [DataXceiver for client DFSClient_NONMAPREDUCE_849671738_1 at
/127.0.0.1:47318 [Receiving block BP-1927700312-172.26.2.1-1451887902222:blk_1073741825_1001]]
ERROR datanode.DataNode (DataXceiver.java:run(280)) - host0.foo.com:49997:DataXceiver error
processing WRITE_BLOCK operation src: /127.0.0.1:47318 dst: /127.0.0.1:49997
> java.io.IOException: Non existent blockpool BP-1927700312-172.26.2.1-1451887902222
> at org.apache.hadoop.hdfs.server.datanode.SimulatedFSDataset.getMap(SimulatedFSDataset.java:583)
> at org.apache.hadoop.hdfs.server.datanode.SimulatedFSDataset.createTemporary(SimulatedFSDataset.java:955)
> at org.apache.hadoop.hdfs.server.datanode.SimulatedFSDataset.createRbw(SimulatedFSDataset.java:941)
> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:203)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.getBlockReceiver(DataXceiver.java:1235)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:678)
> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:166)
> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:103)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:253)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> For a bit more context, this test starts a cluster with two name nodes and one data node.
The block pools are added, but one of them is not found after added. The root cause is due
to an undetected concurrent access in a hash map in SimulatedFSDataset (two block pools are
added simultaneously). I added some logs to print blockMap, and saw a few ConcurrentModificationExceptions.
The solution would be to use a thread safe class instead, like ConcurrentHashMap.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message