hadoop-hdfs-issues mailing list archives

From "Yicong Cai (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-14311) multi-threading conflict at layoutVersion when loading block pool storage
Date Fri, 22 Feb 2019 07:35:00 GMT
Yicong Cai created HDFS-14311:
---------------------------------

             Summary: multi-threading conflict at layoutVersion when loading block pool storage
                 Key: HDFS-14311
                 URL: https://issues.apache.org/jira/browse/HDFS-14311
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: rolling upgrades
    Affects Versions: 2.9.2
            Reporter: Yicong Cai


When a DataNode is upgraded from 2.7.3 to 2.9.2, a conflict occurs on StorageInfo.layoutVersion
while the block pool storage is being loaded.

It causes the following exception:

 
{panel:title=exceptions}
2019-02-15 10:18:01,357 [13783] - INFO [Thread-33:BlockPoolSliceStorage@395] - Restored 36974 block files from trash before the layout upgrade. These blocks will be moved to the previous directory during the upgrade
2019-02-15 10:18:01,358 [13784] - WARN [Thread-33:BlockPoolSliceStorage@226] - Failed to analyze storage directories for block pool BP-1216718839-10.120.232.23-1548736842023
java.io.IOException: Datanode state: LV = -57 CTime = 0 is newer than the namespace state: LV = -63 CTime = 0
 at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.doTransition(BlockPoolSliceStorage.java:406)
 at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadStorageDirectory(BlockPoolSliceStorage.java:177)
 at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:221)
 at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:250)
 at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadBlockPoolSliceStorage(DataStorage.java:460)
 at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:390)
 at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:556)
 at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1649)
 at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1610)
 at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:388)
 at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
 at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
 at java.lang.Thread.run(Thread.java:748)
2019-02-15 10:18:01,358 [13784] - WARN [Thread-33:DataStorage@472] - Failed to add storage directory [DISK]file:/mnt/dfs/2/hadoop/hdfs/data/ for block pool BP-1216718839-10.120.232.23-1548736842023
java.io.IOException: Datanode state: LV = -57 CTime = 0 is newer than the namespace state: LV = -63 CTime = 0
 at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.doTransition(BlockPoolSliceStorage.java:406)
 at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadStorageDirectory(BlockPoolSliceStorage.java:177)
 at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:221)
 at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:250)
 at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadBlockPoolSliceStorage(DataStorage.java:460)
 at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:390)
 at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:556)
 at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1649)
 at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1610)
 at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:388)
 at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
 at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
 at java.lang.Thread.run(Thread.java:748) 
{panel}
 

Root cause:

A single BlockPoolSliceStorage instance is shared by the recover-transition work for all storage
locations. BlockPoolSliceStorage.doTransition reads the old layoutVersion from local storage,
compares it with the current DataNode version, and then performs the upgrade. doUpgrade hands
the transition work to a sub-thread, and that work sets the BlockPoolSliceStorage's layoutVersion
to the current DN version. The transition check for the next storage dir therefore runs concurrently
with the actual transition work of the previous storage dir, so the layoutVersion of the shared
BlockPoolSliceStorage instance becomes inconsistent.
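
The interleaving described above can be reproduced in a small standalone model. This is only a sketch and not the Hadoop code: SharedStorage, check, Outcome and secondDirectoryOutcome are hypothetical names, the on-disk value -56 is an assumption (the report only shows -57), and the position of the race window between the two comparisons is one possible interleaving chosen for illustration. Latches force the ordering so the result is deterministic. The useLocalCopy variant shows one possible mitigation (comparing against a per-directory snapshot); it is not necessarily the fix adopted for this issue.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class LayoutVersionRaceSketch {
    static final int SOFTWARE_LV = -57; // DataNode layout version seen in the log
    static final int ON_DISK_LV = -56;  // assumed pre-upgrade value, not given in the report

    enum Outcome { UP_TO_DATE, UPGRADE, CONFLICT }

    // Hypothetical stand-in for the single storage object shared by all directories.
    static final class SharedStorage {
        volatile int layoutVersion;
    }

    // Transition check for one directory. `window` runs between the two comparisons
    // and marks where another directory's upgrade task may interleave.
    static Outcome check(SharedStorage s, boolean useLocalCopy, Runnable window) {
        s.layoutVersion = ON_DISK_LV;   // load the version stored in this directory
        int local = s.layoutVersion;    // private snapshot, used only by the variant
        if ((useLocalCopy ? local : s.layoutVersion) == SOFTWARE_LV) {
            return Outcome.UP_TO_DATE;
        }
        window.run();
        int seen = useLocalCopy ? local : s.layoutVersion;
        // Layout versions are negative: a larger value means an older layout.
        return seen > SOFTWARE_LV ? Outcome.UPGRADE : Outcome.CONFLICT;
    }

    // Outcome for the second directory while the first directory's upgrade task
    // is forced to write the shared field inside the window.
    static Outcome secondDirectoryOutcome(boolean useLocalCopy) throws Exception {
        SharedStorage s = new SharedStorage();
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Outcome first = check(s, useLocalCopy, () -> { });
            if (first != Outcome.UPGRADE) {
                throw new IllegalStateException("unexpected: " + first);
            }
            CountDownLatch inWindow = new CountDownLatch(1);
            CountDownLatch written = new CountDownLatch(1);
            // Upgrade work for directory 1 runs on a sub-thread and ends by
            // setting the shared layoutVersion to the software version.
            pool.submit(() -> {
                try {
                    inWindow.await();
                    s.layoutVersion = SOFTWARE_LV;
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    written.countDown();
                }
            });
            return check(s, useLocalCopy, () -> {
                inWindow.countDown();
                try {
                    written.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("shared field: " + secondDirectoryOutcome(false));
        System.out.println("local copy:   " + secondDirectoryOutcome(true));
    }
}
```

In this model the second directory's check sees -57 in the shared field instead of its own on-disk value, which corresponds to the "LV = -57 ... is newer than the namespace state" failure in the log. With the per-directory snapshot the check still sees the older value and schedules the upgrade.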

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

