hadoop-hdfs-issues mailing list archives

From "Xiao Chen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-14021) TestReconstructStripedBlocksWithRackAwareness#testReconstructForNotEnoughRacks fails intermittently
Date Tue, 23 Oct 2018 21:26:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-14021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16661307#comment-16661307 ]

Xiao Chen commented on HDFS-14021:
----------------------------------

Attached a sample failure report and a patch to fix it.

 
 {noformat}
2018-10-15 23:02:12,834 [Block report processor] DEBUG blockmanagement.BlockManager (BlockManager.java:processAndHandleReportedBlock(3839))
- In memory blockUCState = UNDER_CONSTRUCTION
2018-10-15 23:02:12,836 [Block report processor] DEBUG BlockStateChange (BlockManager.java:addStoredBlock(3148))
- BLOCK* addStoredBlock: 127.0.0.1:38427 is added to blk_-9223372036854775792_1001 (size=0)
2018-10-15 23:02:12,837 [Block report processor] DEBUG BlockStateChange (BlockManager.java:processIncrementalBlockReport(3949))
- BLOCK* block RECEIVED_BLOCK: blk_-9223372036854775785_1001 is received from 127.0.0.1:38427
2018-10-15 23:02:12,837 [Block report processor] DEBUG BlockStateChange (BlockManager.java:processIncrementalBlockReport(3952))
- *BLOCK* NameNode.processIncrementalBlockReport: from 127.0.0.1:38427 receiving: 0, received:
1, deleted: 0
---> 2018-10-15 23:02:12,840 [IPC Server handler 7 on 35885] DEBUG BlockStateChange (LowRedundancyBlocks.java:add(293))
- BLOCK* NameSystem.LowRedundancyBlock.add: blk_-9223372036854775792_1001 has only 8 replicas
and need 9 replicas so is added to neededReconstructions at priority level 2
2018-10-15 23:02:12,840 [IPC Server handler 7 on 35885] INFO hdfs.StateChange (FSNamesystem.java:completeFile(2830))
- DIR* completeFile: /foo is closed by DFSClient_NONMAPREDUCE_-442030319_1
2018-10-15 23:02:12,841 [Block report processor] DEBUG blockmanagement.BlockManager (BlockManager.java:processAndHandleReportedBlock(3816))
- Reported block blk_-9223372036854775784_1001 on 127.0.0.1:44904 size 2097152 replicaState
= FINALIZED
2018-10-15 23:02:12,841 [Block report processor] DEBUG blockmanagement.BlockManager (BlockManager.java:processAndHandleReportedBlock(3839))
- In memory blockUCState = COMPLETE
2018-10-15 23:02:12,841 [Block report processor] DEBUG BlockStateChange (BlockManager.java:addStoredBlock(3148))
- BLOCK* addStoredBlock: 127.0.0.1:44904 is added to blk_-9223372036854775792_1001 (size=12582912)
2018-10-15 23:02:12,841 [main] INFO hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdown(1965))
- Shutting down the Mini HDFS Cluster
---> 2018-10-15 23:02:12,842 [Block report processor] DEBUG BlockStateChange (LowRedundancyBlocks.java:remove(387))
- BLOCK* NameSystem.LowRedundancyBlock.remove: Removing block blk_-9223372036854775792_1001
from priority queue 2
2018-10-15 23:02:12,842 [main] INFO hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdownDataNode(2013))
- Shutting down DataNode 8
2018-10-15 23:02:12,842 [Block report processor] DEBUG BlockStateChange (BlockManager.java:processIncrementalBlockReport(3949))
- BLOCK* block RECEIVED_BLOCK: blk_-9223372036854775784_1001 is received from 127.0.0.1:44904
2018-10-15 23:02:12,844 [Block report processor] DEBUG BlockStateChange (BlockManager.java:processIncrementalBlockReport(3952))
- *BLOCK* NameNode.processIncrementalBlockReport: from 127.0.0.1:44904 receiving: 0, received:
1, deleted: 0
2018-10-15 23:02:12,843 [org.apache.hadoop.hdfs.server.datanode.DataXceiverServer@62e7dffa]
INFO datanode.DataNode (DataXceiverServer.java:closeAllPeers(281)) - Closing all peers.
2018-10-15 23:02:12,843 [main] WARN datanode.DirectoryScanner (DirectoryScanner.java:shutdown(340))
- DirectoryScanner: shutdown has been called

 {noformat}

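It appears to be a race condition between the block reports and the test's check of {{numOfUnderReplicatedBlocks}}. In the log above (lines marked with {{--->}}), blk_-9223372036854775792_1001 is added to neededReconstructions at 23:02:12,840 because only 8 of the 9 needed replicas have been reported, and it is removed from the queue at 23:02:12,842 once the report from 127.0.0.1:44904 is processed. A check that runs inside that window sees 1 instead of the expected 0.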
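
One common way to take this kind of race out of a test is to poll for the expected value with a timeout instead of asserting on it once. The sketch below illustrates only that idea in plain Java; it is not the content of the attached patch. The class name, the {{lowRedundancyBlocks}} counter, and the {{waitFor}} helper are all hypothetical stand-ins (Hadoop's test utilities have a similar helper, {{GenericTestUtils.waitFor}}, but whether the patch uses it is not stated here).

```java
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

public class WaitForLowRedundancyDemo {

  // Hypothetical stand-in for the NameNode's low-redundancy block count.
  // Starts at 1, as if the last block report has not been processed yet.
  static final AtomicInteger lowRedundancyBlocks = new AtomicInteger(1);

  /** Polls the condition until it holds, or fails after timeoutMs. */
  static void waitFor(BooleanSupplier check, long intervalMs, long timeoutMs)
      throws InterruptedException, TimeoutException {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!check.getAsBoolean()) {
      if (System.nanoTime() >= deadline) {
        throw new TimeoutException(
            "condition not met within " + timeoutMs + " ms");
      }
      Thread.sleep(intervalMs);
    }
  }

  public static void main(String[] args) throws Exception {
    // Simulated "Block report processor": the final report arrives a little
    // after the file is closed and clears the low-redundancy count.
    Thread reporter = new Thread(() -> {
      try {
        Thread.sleep(50);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      lowRedundancyBlocks.set(0);
    }, "simulated-block-report-processor");
    reporter.start();

    // A single immediate check here could still observe 1.
    // Polling tolerates the delay between file close and the last report.
    waitFor(() -> lowRedundancyBlocks.get() == 0, 10, 5_000);
    reporter.join();

    System.out.println("low-redundancy blocks: " + lowRedundancyBlocks.get());
  }
}
```

The timeout keeps a genuine failure (the count never reaching 0) from hanging the test, while the polling interval absorbs the few milliseconds seen between the two marked log lines.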


> TestReconstructStripedBlocksWithRackAwareness#testReconstructForNotEnoughRacks fails intermittently
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-14021
>                 URL: https://issues.apache.org/jira/browse/HDFS-14021
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding, test
>    Affects Versions: 3.0.0
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>            Priority: Major
>         Attachments: HDFS-14021.01.patch, TEST-org.apache.hadoop.hdfs.server.blockmanagement.TestReconstructStripedBlocksWithRackAwareness.xml
>
>
> The test sometimes fails with:
> {noformat}
> java.lang.AssertionError: expected:<0> but was:<1>
> 	at org.apache.hadoop.hdfs.server.blockmanagement.TestReconstructStripedBlocksWithRackAwareness.testReconstructForNotEnoughRacks(TestReconstructStripedBlocksWithRackAwareness.java:171)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
