hadoop-hdfs-issues mailing list archives

From "Wei-Chiu Chuang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-11155) VolumeScanner should report the latest generation stamp of a bad replica
Date Fri, 18 Nov 2016 05:51:58 GMT
Wei-Chiu Chuang created HDFS-11155:
--------------------------------------

             Summary: VolumeScanner should report the latest generation stamp of a bad replica
                 Key: HDFS-11155
                 URL: https://issues.apache.org/jira/browse/HDFS-11155
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: datanode
    Affects Versions: 2.7.4
         Environment: CDH5.7.2
            Reporter: Wei-Chiu Chuang
            Assignee: Wei-Chiu Chuang


HDFS-10512 fixed a race condition that caused VolumeScanner to terminate abruptly when a corrupt
replica is detected. However, when VolumeScanner detects a corrupt replica, it still reports
the replica's old generation stamp to the NN. The NN then directs the DN to remove the replica
with that old generation stamp, but because the generation stamp has since been updated, the DN
cannot find it, so the corrupt replica is never deleted and stays corrupt.

NN's log shows something similar to the following:
{quote}
2016-11-17 21:08:05,350 INFO BlockStateChange: BLOCK NameSystem.addToCorruptReplicasMap: blk_1077571736
added as corrupt on 192.168.168.58:50010 by /192.168.168.58  because client machine reported
it
2016-11-17 21:08:05,350 INFO BlockStateChange: BLOCK* invalidateBlock: blk_1077571736_3991953(stored=blk_1077571736_3992018)
on 192.168.168.58:50010
{quote}
The DN's log shows the following:

{noformat}
2016-11-17 21:08:04,815 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl:
Appending to FinalizedReplica, blk_1077571736_3991953, FINALIZED
  getNumBytes()     = 39061752
  getBytesOnDisk()  = 39061752
  getVisibleLength()= 39061752
  getVolume()       = /data/3/dfs/dn/current
  getBlockFile()    = /data/3/dfs/dn/current/BP-1092022411-192.168.168.55-1474407949037/current/finalized/subdir58/subdir112/blk_1077571736

2016-11-17 21:08:09,158 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl:
Failed to delete replica blk_1077571736_3991953: ReplicaInfo not found.
{noformat}
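
The mismatch visible in the logs above can be illustrated with a small, self-contained Java sketch. It is a toy model, not DataNode code: the class GenStampSketch and its methods addReplica, bumpGenStamp and invalidate are hypothetical names introduced only for this illustration, and the assumption that deletion requires an exact generation stamp match is inferred from the "ReplicaInfo not found" message rather than taken from the FsDatasetImpl source. The block ID and generation stamps are the ones shown in the logs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of a DN replica map; not a Hadoop class.
final class GenStampSketch {
    // Toy replica map: block ID -> generation stamp currently stored on the DN.
    private final Map<Long, Long> replicas = new HashMap<>();

    public void addReplica(long blockId, long genStamp) {
        replicas.put(blockId, genStamp);
    }

    // Models an append that bumps the generation stamp of an existing replica.
    public void bumpGenStamp(long blockId, long newGenStamp) {
        replicas.replace(blockId, newGenStamp);
    }

    // Assumption: the replica is deleted only if the requested generation
    // stamp matches the stored one; otherwise the request fails.
    public boolean invalidate(long blockId, long genStamp) {
        Long stored = replicas.get(blockId);
        if (stored == null || stored != genStamp) {
            return false; // corresponds to "ReplicaInfo not found"
        }
        replicas.remove(blockId);
        return true;
    }

    public static void main(String[] args) {
        GenStampSketch dn = new GenStampSketch();
        dn.addReplica(1077571736L, 3991953L);   // finalized replica before the append
        dn.bumpGenStamp(1077571736L, 3992018L); // append updates the generation stamp

        // The corruption report carried the old stamp, so the delete request fails.
        System.out.println("stale stamp deleted:  " + dn.invalidate(1077571736L, 3991953L));
        // Reporting the latest stamp would let the delete request succeed.
        System.out.println("latest stamp deleted: " + dn.invalidate(1077571736L, 3992018L));
    }
}
```

Under these assumptions, the request carrying the old stamp 3991953 is rejected and the replica stays on disk, while a request carrying the latest stamp 3992018 removes it, which is the behavior this issue asks VolumeScanner's report to enable.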


