hadoop-hdfs-issues mailing list archives

From "Javier Maestro (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8533) Mismatch in displaying the "MissingBlock" count in fsck and in other metric reports
Date Wed, 06 Apr 2016 22:01:25 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15229221#comment-15229221 ]

Javier Maestro commented on HDFS-8533:

I've seen this issue as well. I'm running with a replication factor of 3 and I
get alerts because the JMX metrics report missing blocks while {{fsck}} doesn't
see any. Here's some relevant output from {{fsck}} and {{JMX}} (redacted for
privacy reasons):

{code:title=hdfs fsck /}
................................................................Status: HEALTHY
 Minimally replicated blocks:   XXXX (99.99999 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       XXXX (0.57755035 %)
 Mis-replicated blocks:         XXXX (0.1345275 %)
 Default replication factor:    3
 Average block replication:     2.9942245
 Corrupt blocks:                0
 Missing replicas:              XXXX (0.19251679 %)
 Number of data-nodes:          XXXX
 Number of racks:               X
{code}

{code:title=JMX for Hadoop:service=NameNode,name=FSNamesystem}
"name": "Hadoop:service=NameNode,name=FSNamesystem",
"CorruptBlocks": 9,
"MissingBlocks": 9,
{code}

{code:title=JMX for Hadoop:service=NameNode,name=NameNodeInfo}
"name": "Hadoop:service=NameNode,name=NameNodeInfo",
"NumberOfMissingBlocks": 9,
{code}

h2. TL;DR

h3. FSCK
- {{fsck}} goes over all of the blocks, one by one, and:
    - counts a block as {{corrupt}} if {{LocatedBlock}}'s {{isCorrupt()}} returns true
    - counts a block as {{missing}} if {{totalReplicasPerBlock == 0 && !isCorrupt}}

- {{fsck}} reports HEALTHY if there are no missing or corrupt blocks ({{L1078}})
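
As a rough illustration of the two counting rules and the HEALTHY check above, here is a minimal sketch in plain Java. It is not the {{NamenodeFsck}} code: {{FsckCountSketch}}, {{BlockView}}, {{Result}} and {{scan}} are made-up names for this example, and the two fields only stand in for {{totalReplicasPerBlock}} and {{isCorrupt()}}.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of the per-block accounting described above; not HDFS code. */
class FsckCountSketch {

    /** Hypothetical stand-in for the two facts looked at per block. */
    static final class BlockView {
        final int totalReplicas; // plays the role of totalReplicasPerBlock
        final boolean corrupt;   // plays the role of LocatedBlock.isCorrupt()

        BlockView(int totalReplicas, boolean corrupt) {
            this.totalReplicas = totalReplicas;
            this.corrupt = corrupt;
        }
    }

    /** Counters accumulated over one pass across all blocks. */
    static final class Result {
        int corrupt;
        int missing;

        /** HEALTHY only when nothing is missing and nothing is corrupt. */
        boolean healthy() {
            return corrupt == 0 && missing == 0;
        }
    }

    static Result scan(List<BlockView> blocks) {
        Result result = new Result();
        for (BlockView block : blocks) {
            if (block.corrupt) {
                result.corrupt++;
            }
            // A corrupt block is never counted as missing.
            if (block.totalReplicas == 0 && !block.corrupt) {
                result.missing++;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Result result = scan(Arrays.asList(
                new BlockView(3, false),
                new BlockView(0, false),
                new BlockView(1, true)));
        System.out.println("corrupt=" + result.corrupt
                + " missing=" + result.missing
                + " healthy=" + result.healthy());
    }
}
```

The point of the sketch is only that the two counters are disjoint by construction: a block flagged corrupt can never add to the missing count.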

h3. JMX

h4. CorruptBlocks
The internal metrics use {{CorruptReplicasMap}}, which is ultimately updated via
{{FSNamesystem}}'s {{reportBadBlocks()}}. In other words, the count grows
whenever a client reports bad block locations.

Thus, the mismatch lies between whatever triggers the client to call
{{reportBadBlocks()}} and what {{LocatedBlock}}'s {{isCorrupt()}} returns for
those same blocks.

Looking at the RPC entry point that forwards to {{FSNamesystem}}'s {{reportBadBlocks()}}:

  /**
   * The client has detected an error on the specified located blocks
   * and is reporting them to the server.  For now, the namenode will
   * mark the block as corrupt.  In the future we might
   * check the blocks are actually corrupt.
   */
  @Override // ClientProtocol, DatanodeProtocol
  public void reportBadBlocks(LocatedBlock[] blocks) throws IOException {
    checkNNStartup();
    namesystem.reportBadBlocks(blocks);
  }

So, unlike {{fsck}}, there is no check in there to filter blocks that are
reported as being bad by the datanodes but that are actually not corrupt if you
check {{LocatedBlock}}'s {{isCorrupt()}} after they've been received.

h4. MissingBlocks
The outcome here is the same as for {{CorruptBlocks}}, because the update to
{{neededReconstruction}} ({{LowRedundancyBlocks}}) also happens within
{{markBlockAsCorrupt}}, after checking {{isPopulatingReplQueues()}}.

The internal metrics use {{this.neededReconstruction.getCorruptBlockSize()}}.

This is pretty broken compared to the approach in {{fsck}}: JMX counts
corrupt blocks as missing, while {{fsck}} only counts blocks that have no
replicas at all.

It also explains why {{CorruptBlocks}} always matches {{MissingBlocks}} in JMX.
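
To make the difference concrete, here is a small sketch that applies both rules to the scenario from the issue description (one corrupt replica on DN1, DN2 down). It is a deliberate simplification, not HDFS code: {{MissingCountComparison}} and both method names are invented for this example, and the JMX rule is modelled only from the behaviour described in this ticket.

```java
/** Toy comparison of the two "missing" rules; a simplification, not HDFS code. */
class MissingCountComparison {

    /** fsck-style rule: no replicas left and the block is not flagged corrupt. */
    static boolean missingPerFsck(int goodReplicas, int corruptReplicas) {
        // Assumption: a block is "corrupt" when every replica still known is corrupt.
        boolean isCorrupt = goodReplicas == 0 && corruptReplicas > 0;
        return goodReplicas == 0 && !isCorrupt;
    }

    /**
     * JMX-style rule as described in this ticket: a block without any good
     * replica is counted as missing, whether or not corrupt replicas exist.
     * The second parameter is unused on purpose, to show that it is ignored.
     */
    static boolean missingPerJmx(int goodReplicas, int corruptReplicas) {
        return goodReplicas == 0;
    }

    public static void main(String[] args) {
        // Scenario from the issue description: one corrupt replica, no good one.
        int good = 0;
        int corrupt = 1;
        System.out.println("fsck counts it as missing: " + missingPerFsck(good, corrupt));
        System.out.println("JMX counts it as missing:  " + missingPerJmx(good, corrupt));
    }
}
```

Under these assumptions the two rules only disagree for blocks whose remaining replicas are all corrupt, which is exactly the case reported here.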

h3. Conclusion

What I'd do is:

- filter the bad blocks on reception, within {{NameNodeRpcServer.java}}'s
  {{reportBadBlocks}} and only mark as bad those that pass {{LocatedBlock}}'s
  {{isCorrupt()}}, like {{fsck}} already does.

- Wishlist: keep track of the number of mis-reported blocks in the cell and/or
  of which datanodes mis-report blocks as bad, since this will probably help
  find bugs and/or bad datanodes, hardware, etc.

- make sure {{MissingBlocks}} is counted the same way in
  {{BlockManager}} (JMX) and {{NamenodeFsck}} ({{fsck}}). For the record, I
  think {{fsck}} does The Right Thing (TM): it counts as {{missing}} only the
  blocks that are completely unavailable *and* that are *not* corrupt.
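
The first two items could look roughly like the sketch below. This is not a patch against {{NameNodeRpcServer}}: {{BadBlockReportFilter}}, {{Report}}, {{filter}} and {{misreportsFor}} are hypothetical names, and the corruption check is passed in as a predicate because how the namenode would actually verify a report is left open here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Sketch of "filter on reception, remember who mis-reported"; not HDFS code. */
class BadBlockReportFilter {

    /** Hypothetical stand-in for one reported bad block location. */
    static final class Report {
        final long blockId;
        final String datanode;

        Report(long blockId, String datanode) {
            this.blockId = blockId;
            this.datanode = datanode;
        }
    }

    // Number of reports per datanode that did not pass the corruption check.
    private final Map<String, Integer> misreportsByDatanode = new HashMap<>();

    /** Returns only the reports the check confirms; tallies the rest. */
    List<Report> filter(List<Report> reports, Predicate<Report> isActuallyCorrupt) {
        List<Report> confirmed = new ArrayList<>();
        for (Report report : reports) {
            if (isActuallyCorrupt.test(report)) {
                confirmed.add(report);
            } else {
                misreportsByDatanode.merge(report.datanode, 1, Integer::sum);
            }
        }
        return confirmed;
    }

    /** Mis-report count for one datanode, 0 if it never mis-reported. */
    int misreportsFor(String datanode) {
        return misreportsByDatanode.getOrDefault(datanode, 0);
    }
}
```

Only the confirmed reports would then be marked as corrupt, and the per-datanode tally could be exposed as a metric to spot misbehaving nodes.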

Finally, it'll be interesting to find out what triggers the Datanode to report
bad blocks that are actually not bad. I'll dig deeper and see if there's
something broken somewhere in my cluster :D

> Mismatch in displaying the "MissingBlock" count in fsck and in other metric reports
> -----------------------------------------------------------------------------------
>                 Key: HDFS-8533
>                 URL: https://issues.apache.org/jira/browse/HDFS-8533
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: J.Andreina
>            Assignee: J.Andreina
>            Priority: Critical
> Number of DN = 2
> Step 1: Write a file with replication factor - 3 .
> Step 2: Corrupt a replica in DN1
> Step 3: DN2 is down. 
> Missing Block count in  report is as follows
> Fsck report                                    : *0*
> Jmx, "dfsadmin -report" , UI, logs : *1*
> In fsck , only block whose replicas are all missed and not been corrupted are counted

> {code}
> if (totalReplicasPerBlock == 0 && !isCorrupt) {
>         // If the block is corrupted, it means all its available replicas are
>         // corrupted. We don't mark it as missing given these available replicas
>         // might still be accessible as the block might be incorrectly marked as
>         // corrupted by client machines.
> {code}
> While in other reports even if all the replicas are corrupted , block is been considered as missed.
> Please provide your thoughts : can we make missing block count consistent across all the reports same as implemented for fsck?

This message was sent by Atlassian JIRA
