hadoop-hdfs-issues mailing list archives

From "Weiwei Yang (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-12440) Ozone: TestAllocateContainer fails on jenkins
Date Wed, 13 Sep 2017 07:14:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16164214#comment-16164214 ]

Weiwei Yang edited comment on HDFS-12440 at 9/13/17 7:13 AM:
-------------------------------------------------------------

It looks like all DNs were registered:

{noformat}
2017-09-12 12:08:02,726 [main] INFO  ozone.MiniOzoneCluster (MiniOzoneCluster.java:lambda$waitOzoneReady$0(259))
     - Waiting for cluster to be ready. Got 0 of 3 DN Heartbeats.
2017-09-12 12:08:03,726 [main] INFO  ozone.MiniOzoneCluster (MiniOzoneCluster.java:lambda$waitOzoneReady$0(259))
     - Waiting for cluster to be ready. Got 0 of 3 DN Heartbeats.
2017-09-12 12:08:04,326 [IPC Server handler 18 on 37181] INFO  node.SCMNodeManager (SCMNodeManager.java:register(745))
     - Data node with ID: c5906234-d717-45d9-bbe8-972bc4dad260 Registered.
2017-09-12 12:08:04,327 [IPC Server handler 10 on 37181] INFO  node.SCMNodeManager (SCMNodeManager.java:register(745))
     - Data node with ID: 2541b3ac-d953-4650-8214-26aa6fd8601e Registered.
2017-09-12 12:08:04,335 [IPC Server handler 11 on 37181] INFO  node.SCMNodeManager (SCMNodeManager.java:register(745))
     - Data node with ID: 5dcadc32-d543-4044-9b58-892eeb6880bf Registered.
2017-09-12 12:08:04,727 [main] INFO  ozone.MiniOzoneCluster (MiniOzoneCluster.java:lambda$waitOzoneReady$0(259))
     - Cluster is ready. Got 3 of 3 DN Heartbeats.
{noformat}

However, the SCM node manager does not seem to have been properly initialized:

{noformat}
org.apache.hadoop.ozone.protocol.StorageContainerLocationProtocol.allocateContainer from 172.17.0.2:53783
java.lang.NullPointerException
	at org.apache.hadoop.ozone.scm.node.SCMNodeManager.getNodeStat(SCMNodeManager.java:828)
	at org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.hasEnoughSpace(SCMCommonPolicy.java:147)
	at org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.lambda$chooseDatanodes$0(SCMCommonPolicy.java:125)
{noformat}
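
To illustrate the kind of failure in the trace above, here is a minimal sketch of how a per-node stat lookup can throw a NullPointerException when a node ID has no entry. This is not the SCMNodeManager code: the class {{NodeStatLookupSketch}}, its map of remaining bytes, and all method names are hypothetical, and whether a missing entry is the actual cause here is an assumption, not a confirmed diagnosis.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a node manager; not the real SCMNodeManager.
final class NodeStatLookupSketch {
  // Assumption: stats are kept in a map keyed by datanode ID.
  private final Map<String, Long> remainingBytesById = new ConcurrentHashMap<>();

  void recordReport(String datanodeId, long remainingBytes) {
    remainingBytesById.put(datanodeId, remainingBytes);
  }

  // Unsafe: get() returns null for an unknown ID, and auto-unboxing
  // that null into a long throws NullPointerException.
  long remainingUnsafe(String datanodeId) {
    return remainingBytesById.get(datanodeId);
  }

  // Safer: make the "no stats yet" case explicit for the caller.
  Optional<Long> remaining(String datanodeId) {
    return Optional.ofNullable(remainingBytesById.get(datanodeId));
  }

  // A placement check that treats a node without stats as not eligible.
  boolean hasEnoughSpace(String datanodeId, long requiredBytes) {
    return remaining(datanodeId).map(r -> r >= requiredBytes).orElse(false);
  }

  public static void main(String[] args) {
    NodeStatLookupSketch nodes = new NodeStatLookupSketch();
    nodes.recordReport("dn-with-stats", 1024L);
    System.out.println(nodes.hasEnoughSpace("dn-with-stats", 512L));
    System.out.println(nodes.hasEnoughSpace("dn-without-stats", 512L));
  }
}
```

With a lookup like this, a placement policy can skip nodes that have no stats instead of failing the whole allocateContainer call.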


was (Author: cheersyang):
It looks like the UT was failing because of a conflict in the dfs test directories. I saw the following messages in the log:

{noformat}
[main] INFO  ozone.MiniOzoneCluster (MiniOzoneCluster.java:setConf(125))      - dn2: set dfs.container.ratis.datanode.storage.dir = /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/3/dfs/data/dn2_data-1
...
[main] INFO  hdfs.MiniDFSCluster (MiniDFSCluster.java:startDataNodes(1596)) - Starting DataNode 2 with dfs.datanode.data.dir: [DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/3/dfs/data/dn2_data0,[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/3/dfs/data/dn2_data1
{noformat}

Then a DataNode failed to lock its storage directory:

{noformat}
2017-09-11 10:54:32,297 [Thread-176] INFO  common.Storage (Storage.java:lock(813)) - Cannot lock storage /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/3/dfs/data/dn0_data1. The directory is already locked
2017-09-11 10:54:32,301 [Thread-176] WARN  common.Storage (DataStorage.java:loadDataStorage(410)) - Failed to add storage directory [DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/3/dfs/data/dn0_data1
java.io.IOException: Cannot lock storage /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/3/dfs/data/dn0_data1. The directory is already locked
	at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:814)
	at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:622)
	at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:262)
	at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:399)
	at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:379)
	at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:544)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1731)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1691)
	at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:376)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
	at java.lang.Thread.run(Thread.java:748)
{noformat}
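
For reference, the "already locked" condition can be reproduced outside Hadoop with plain {{java.nio}} file locks. The sketch below is not the {{Storage.lock}} implementation; the class name {{StorageLockSketch}} and the temporary directory are made up for illustration. It shows only that a second lock attempt on the same lock file from the same JVM is rejected, which is what happens when two DataNodes in one test JVM are pointed at the same directory.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative only; not the Hadoop Storage class.
final class StorageLockSketch {

  /**
   * Locks the given file, then tries to lock it again through a second
   * channel in the same JVM. Returns true if the second attempt is rejected.
   */
  static boolean secondLockIsRejected(Path lockFile) throws IOException {
    try (FileChannel first = FileChannel.open(lockFile,
             StandardOpenOption.CREATE, StandardOpenOption.WRITE);
         FileChannel second = FileChannel.open(lockFile,
             StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
      FileLock held = first.tryLock();
      if (held == null) {
        throw new IOException("Could not take the first lock on " + lockFile);
      }
      try {
        FileLock again = second.tryLock();
        if (again != null) {
          // Unexpected: the second lock was granted.
          again.release();
          return false;
        }
        // null means another process holds an overlapping lock.
        return true;
      } catch (OverlappingFileLockException e) {
        // Thrown when this JVM already holds a lock on the same region.
        return true;
      } finally {
        held.release();
      }
    }
  }

  public static void main(String[] args) throws IOException {
    // Hypothetical stand-in for a shared DataNode data directory.
    Path dir = Files.createTempDirectory("dn0_data1");
    Path lockFile = dir.resolve("in_use.lock");
    System.out.println("second lock rejected: " + secondLockIsRejected(lockFile));
  }
}
```

Giving each DataNode in the mini cluster its own data directory avoids the second lock attempt altogether.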

> Ozone: TestAllocateContainer fails on jenkins
> ---------------------------------------------
>
>                 Key: HDFS-12440
>                 URL: https://issues.apache.org/jira/browse/HDFS-12440
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>    Affects Versions: HDFS-7240
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>            Priority: Minor
>
> I am seeing this failure in [this jenkins report|https://builds.apache.org/job/PreCommit-HDFS-Build/21089/testReport/org.apache.hadoop.ozone.scm/TestAllocateContainer/testAllocate/], with the following error:
> {noformat}
> Stacktrace
> java.lang.NullPointerException
>  at org.apache.hadoop.ozone.scm.node.SCMNodeManager.getNodeStat(SCMNodeManager.java:828)
>  at org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.hasEnoughSpace(SCMCommonPolicy.java:147)
>  at org.apache.hadoop.ozone.scm.container.placement.algorithms.SCMCommonPolicy.lambda$chooseDatanodes$0(SCMCommonPolicy.java:125)
>  at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174)
>  at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
>  at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
>  at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
>  at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
>  at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>  at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
>  at 
> {noformat}


