Mark Ormesher created HDFS-15103:
------------------------------------
Summary: JMX endpoint and "dfsadmin" report 1 corrupt block; "fsck" reports 0
Key: HDFS-15103
URL: https://issues.apache.org/jira/browse/HDFS-15103
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 3.2.1
Environment: * CentOS 7
* HDFS 3.2.1
* 2x HA NNs
* 5x identical DNs
Reporter: Mark Ormesher
We're seeing a long-running discrepancy in the number of corrupt blocks: the JMX endpoint
and {{dfsadmin -report}} both report 1, while {{fsck /}} reports 0. This has persisted through
rolling restarts of the NNs and DNs, and through complete shutdowns of the HDFS cluster for
unrelated maintenance.
{panel:title=JMX endpoint snippet}
{code}
(...)
"CorruptBlocks" : 1,
"ScheduledReplicationBlocks" : 0,
"PendingDeletionBlocks" : 0,
"LowRedundancyReplicatedBlocks" : 0,
"CorruptReplicatedBlocks" : 1,
"MissingReplicatedBlocks" : 0,
"MissingReplicationOneBlocks" : 0,
(...)
{code}
{panel}
{panel:title=dfsadmin -report}
{code}
$ ./hdfs dfsadmin -report | grep -i corrupt
Blocks with corrupt replicas: 1
Block groups with corrupt internal blocks: 0
{code}
{panel}
{panel:title=fsck /}
{code}
$ ./hdfs fsck / -files -blocks | grep -i corrupt
Corrupt blocks: 0
Corrupt block groups: 0
{code}
{panel}
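For tracking whether the disagreement persists over time, the three counts above could be compared mechanically. What follows is only a minimal sketch: the function names are hypothetical, and it assumes the three outputs have already been captured as text in exactly the formats shown in the panels above. It does not contact the cluster or call any Hadoop API.

```python
import re

# Hypothetical helper: patterns match the output lines shown in the panels above.
# The leading quote in the JMX pattern keeps it from matching "CorruptReplicatedBlocks".
PATTERNS = {
    "jmx": r'"CorruptBlocks"\s*:\s*(\d+)',
    "dfsadmin": r"Blocks with corrupt replicas:\s*(\d+)",
    "fsck": r"Corrupt blocks:\s*(\d+)",
}


def extract_count(source: str, text: str) -> int:
    """Return the corrupt-block count found in `text` for the given source."""
    match = re.search(PATTERNS[source], text)
    if match is None:
        raise ValueError(f"no corrupt-block count found in {source} output")
    return int(match.group(1))


def compare_counts(jmx_text: str, dfsadmin_text: str, fsck_text: str) -> dict:
    """Collect the count from each source into one dictionary."""
    return {
        "jmx": extract_count("jmx", jmx_text),
        "dfsadmin": extract_count("dfsadmin", dfsadmin_text),
        "fsck": extract_count("fsck", fsck_text),
    }


def counts_disagree(counts: dict) -> bool:
    """True when the sources do not all report the same number."""
    return len(set(counts.values())) > 1


# Sample inputs copied from the panels above.
JMX_SAMPLE = '"CorruptBlocks" : 1,\n"CorruptReplicatedBlocks" : 1,'
DFSADMIN_SAMPLE = (
    "Blocks with corrupt replicas: 1\n"
    "Block groups with corrupt internal blocks: 0"
)
FSCK_SAMPLE = "Corrupt blocks: 0\nCorrupt block groups: 0"

if __name__ == "__main__":
    counts = compare_counts(JMX_SAMPLE, DFSADMIN_SAMPLE, FSCK_SAMPLE)
    print(counts, "disagree" if counts_disagree(counts) else "agree")
```

With the sample values from this report, the sketch yields 1 for the JMX endpoint and {{dfsadmin}} and 0 for {{fsck}}, so it flags a disagreement. It only detects the mismatch; it does not say which source is correct.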
I've read through the related tickets below, all of which suggest this issue was resolved
in 2.7.8, but we're seeing it in 3.2.1.
https://issues.apache.org/jira/browse/HDFS-8533
https://issues.apache.org/jira/browse/HDFS-10213
https://issues.apache.org/jira/browse/HDFS-13999
How can we work out whether we really do have a corrupt block and, if we do, which block
it is, given that {{fsck}} thinks everything is fine?
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org