hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Vinitha Reddy Gankidi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order
Date Fri, 16 Sep 2016 02:59:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495211#comment-15495211
] 

Vinitha Reddy Gankidi commented on HDFS-10301:
----------------------------------------------

[~jingzhao] 
??Then can this cover DN hotswap case??
Yes, I will explain how it does below.

??For DN hotswap, I think the DN only sends FBR to notify NN about the change??
That is right.

During hotswap {{DataNode.reconfigurePropertyImpl()}} is invoked which identifies the newly
added/removed volumes. For all the volumes to be removed, {{FsDatasetImpl.removeVolumes()}}
is called. This also removes the block infos from the FsDataset. It does so by adding these
blocks to the {{blkToInvalidate}} map. Then the {{FsDatasetImpl.invalidate()}} method is invoked
for all the blocks in the map.
{code}
   * Invalidate a block but does not delete the actual on-disk block file.
   *
   * It should only be used when deactivating disks.
   *
   * @param bpid the block pool ID.
   * @param block The block to be invalidated.
   */
  public void invalidate(String bpid, ReplicaInfo block) {
    // If a DFSClient has the replica in its cache of short-circuit file
    // descriptors (and the client is using ShortCircuitShm), invalidate it.
    datanode.getShortCircuitRegistry().processBlockInvalidation(
        new ExtendedBlockId(block.getBlockId(), bpid));

    // If the block is cached, start uncaching it.
    cacheManager.uncacheBlock(bpid, block.getBlockId());

    datanode.notifyNamenodeDeletedBlock(new ExtendedBlock(bpid, block),
        block.getStorageUuid());
  }
{code}

As you can see, these blocks are reported to the NN as deleted. So, the NN eventually removes
all the blocks associated with this volume. Once this is done, the volume is actually pruned
by {{DatanodeDescriptor.pruneStorageMap()}} in the subsequent heartbeat.

> BlockReport retransmissions may lead to storages falsely being declared zombie if storage
report processing happens out of order
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-10301
>                 URL: https://issues.apache.org/jira/browse/HDFS-10301
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.6.1
>            Reporter: Konstantin Shvachko
>            Assignee: Vinitha Reddy Gankidi
>            Priority: Critical
>         Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, HDFS-10301.004.patch,
HDFS-10301.005.patch, HDFS-10301.006.patch, HDFS-10301.007.patch, HDFS-10301.008.patch, HDFS-10301.009.patch,
HDFS-10301.01.patch, HDFS-10301.010.patch, HDFS-10301.011.patch, HDFS-10301.012.patch, HDFS-10301.013.patch,
HDFS-10301.014.patch, HDFS-10301.branch-2.7.patch, HDFS-10301.branch-2.patch, HDFS-10301.sample.patch,
zombieStorageLogs.rtf
>
>
> When NameNode is busy a DataNode can timeout sending a block report. Then it sends the
block report again. Then NameNode while process these two reports at the same time can interleave
processing storages from different reports. This screws up the blockReportId field, which
makes NameNode think that some storages are zombie. Replicas from zombie storages are immediately
removed, causing missing blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message