hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Yongjun Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6833) DirectoryScanner should not register a deleting block with memory of DataNode
Date Wed, 17 Dec 2014 23:53:14 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14250839#comment-14250839
] 

Yongjun Zhang commented on HDFS-6833:
-------------------------------------

Hi [~sinchii],

Thanks for your new rev and sorry for late response. It looks good to me except for two minor
things that you can take care of after getting committer's review:

*  {{public int getNumDeletingBlocks(String bpid)}} is not used anywhere. Consider removing.
It might be possible that we need to have such an util in the future, if so, the method need
to be implemented in the ReplicaMap protected with the internal mutex.

*   About {{if (m != null) {}} in {{void removeBlocks(String bpid, Set<Long> blockIds)}},
it's better to check if m is null and return if so right after getting m, instead of doing
the check again and again in the loop. Or you can put the loop within {{if (m != null) {...}.}}

HI [~cnauroth] and [~szetszwo], thanks for your earlier review. Wonder if any of you would
have time to take a look at the latest? thanks much.





> DirectoryScanner should not register a deleting block with memory of DataNode
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-6833
>                 URL: https://issues.apache.org/jira/browse/HDFS-6833
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 3.0.0, 2.5.0, 2.5.1
>            Reporter: Shinichi Yamashita
>            Assignee: Shinichi Yamashita
>            Priority: Critical
>         Attachments: HDFS-6833-10.patch, HDFS-6833-11.patch, HDFS-6833-12.patch, HDFS-6833-6-2.patch,
HDFS-6833-6-3.patch, HDFS-6833-6.patch, HDFS-6833-7-2.patch, HDFS-6833-7.patch, HDFS-6833.8.patch,
HDFS-6833.9.patch, HDFS-6833.patch, HDFS-6833.patch, HDFS-6833.patch, HDFS-6833.patch, HDFS-6833.patch
>
>
> When a block is deleted in DataNode, the following messages are usually output.
> {code}
> 2014-08-07 17:53:11,606 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
Scheduling blk_1073741825_1001 file /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
for deletion
> 2014-08-07 17:53:11,617 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
Deleted BP-1887080305-172.28.0.101-1407398838872 blk_1073741825_1001 file /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
> {code}
> However, DirectoryScanner may be executed when DataNode deletes the block in the current
implementation. And the following messsages are output.
> {code}
> 2014-08-07 17:53:30,519 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
Scheduling blk_1073741825_1001 file /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
for deletion
> 2014-08-07 17:53:31,426 INFO org.apache.hadoop.hdfs.server.datanode.DirectoryScanner:
BlockPool BP-1887080305-172.28.0.101-1407398838872 Total blocks: 1, missing metadata files:0,
missing block files:0, missing blocks in memory:1, mismatched blocks:0
> 2014-08-07 17:53:31,426 WARN org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl:
Added missing block to memory FinalizedReplica, blk_1073741825_1001, FINALIZED
>   getNumBytes()     = 21230663
>   getBytesOnDisk()  = 21230663
>   getVisibleLength()= 21230663
>   getVolume()       = /hadoop/data1/dfs/data/current
>   getBlockFile()    = /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
>   unlinked          =false
> 2014-08-07 17:53:31,531 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService:
Deleted BP-1887080305-172.28.0.101-1407398838872 blk_1073741825_1001 file /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
> {code}
> Deleting block information is registered in DataNode's memory.
> And when DataNode sends a block report, NameNode receives wrong block information.
> For example, when we execute recommission or change the number of replication, NameNode
may delete the right block as "ExcessReplicate" by this problem.
> And "Under-Replicated Blocks" and "Missing Blocks" occur.
> When DataNode run DirectoryScanner, DataNode should not register a deleting block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message