hadoop-hdfs-issues mailing list archives

From "Shinichi Yamashita (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6833) DirectoryScanner should not register a deleting block with memory of DataNode
Date Wed, 10 Dec 2014 16:18:14 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241316#comment-14241316 ]

Shinichi Yamashita commented on HDFS-6833:
------------------------------------------

Hi [~yzhangal],

Thank you for your many comments! The patch I attached earlier was for branch 2.6.
I have now attached a patch for trunk that reflects your comments.
My cluster runs an old version of Hadoop, so I will set up a trunk-version cluster
and confirm the fix there.

BTW, the *private boolean scanning* field in DirectoryScanner in the previous patch was there
to prevent false detection in DirectoryScanner#scan.
In other words, I think the current patch is still exposed to the following sequence.

{code}
  void scan() {
    clear();
    // (1) DirectoryScanner confirms that the block file and meta file exist.
    Map<String, ScanInfo[]> diskReport = getDiskReport();

    // (2) Meanwhile, FsDatasetAsyncDiskService deletes the block file and
    //     removes its deleting-block entry.

    synchronized(dataset) {
      // (3) dataset.isDeletingBlock is now false, so addDifference() is
      //     called for the block -> incorrect.
    ...
{code}

I think I might need additional control in DirectoryScanner#getDiskReport or an additional lock
in DirectoryScanner#scan.
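
To make the interleaving above concrete, here is a minimal, self-contained sketch. It is not the HDFS code: the class ScanRaceSketch and all of its fields and methods are hypothetical stand-ins for the replica map, the disk, and the deleting-block set. It only shows why a disk snapshot taken outside the lock can go stale, and how holding one lock across both the snapshot and the comparison (the "additional lock" idea) would avoid it in this simplified model.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified model of the scan/delete race; not the HDFS classes. */
public class ScanRaceSketch {
  // Stand-in for the DataNode's in-memory replica map (block id -> length).
  private final Map<Long, Long> memory = new HashMap<>();
  // Stand-in for the block files present on disk.
  private final Set<Long> disk = new HashSet<>();
  // Blocks scheduled for asynchronous deletion whose files may still exist.
  private final Set<Long> deleting = new HashSet<>();

  public synchronized void addBlock(long id, long len) {
    memory.put(id, len);
    disk.add(id);
  }

  // Deletion, first half: forget the replica and mark it as deleting.
  public synchronized void scheduleDeletion(long id) {
    memory.remove(id);
    deleting.add(id);
  }

  // Deletion, second half (asynchronous in a real DataNode):
  // remove the file and clear the deleting mark.
  public synchronized void finishDeletion(long id) {
    disk.remove(id);
    deleting.remove(id);
  }

  // Step (1): snapshot of the disk, taken in its own critical section.
  public synchronized Set<Long> getDiskReport() {
    return new HashSet<>(disk);
  }

  // Step (3): blocks seen on disk but neither in memory nor marked as deleting.
  public synchronized List<Long> reconcile(Set<Long> diskReport) {
    List<Long> missingInMemory = new ArrayList<>();
    for (long id : diskReport) {
      if (!memory.containsKey(id) && !deleting.contains(id)) {
        missingInMemory.add(id);
      }
    }
    return missingInMemory;
  }

  // Assumed fix: hold the lock across steps (1) and (3) so step (2) cannot
  // complete in between (synchronized is reentrant in Java).
  public synchronized List<Long> scanLocked() {
    return reconcile(getDiskReport());
  }

  public static void main(String[] args) {
    ScanRaceSketch racy = new ScanRaceSketch();
    racy.addBlock(1L, 100L);
    racy.scheduleDeletion(1L);
    Set<Long> snapshot = racy.getDiskReport();    // (1) file still on disk
    racy.finishDeletion(1L);                      // (2) deletion completes
    System.out.println(racy.reconcile(snapshot)); // (3) prints [1]: false detection

    ScanRaceSketch locked = new ScanRaceSketch();
    locked.addBlock(2L, 100L);
    locked.scheduleDeletion(2L);
    System.out.println(locked.scanLocked());      // prints []: deleting mark still set
    locked.finishDeletion(2L);
  }
}
```

In this model the locked variant trades scan concurrency for correctness: deletion cannot finish while a scan is in progress. Whether that cost is acceptable for a real disk scan is a separate question that the sketch does not answer.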

> DirectoryScanner should not register a deleting block with memory of DataNode
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-6833
>                 URL: https://issues.apache.org/jira/browse/HDFS-6833
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 3.0.0, 2.5.0, 2.5.1
>            Reporter: Shinichi Yamashita
>            Assignee: Shinichi Yamashita
>            Priority: Critical
>         Attachments: HDFS-6833-10.patch, HDFS-6833-11.patch, HDFS-6833-6-2.patch, HDFS-6833-6-3.patch,
HDFS-6833-6.patch, HDFS-6833-7-2.patch, HDFS-6833-7.patch, HDFS-6833.8.patch, HDFS-6833.9.patch,
HDFS-6833.patch, HDFS-6833.patch, HDFS-6833.patch, HDFS-6833.patch, HDFS-6833.patch
>
>
> When a block is deleted in DataNode, the following messages are usually output.
> {code}
> 2014-08-07 17:53:11,606 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Scheduling blk_1073741825_1001 file /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825 for deletion
> 2014-08-07 17:53:11,617 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleted BP-1887080305-172.28.0.101-1407398838872 blk_1073741825_1001 file /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
> {code}
> However, in the current implementation DirectoryScanner may run while DataNode is deleting the block, in which case the following messages are output.
> {code}
> 2014-08-07 17:53:30,519 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Scheduling blk_1073741825_1001 file /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825 for deletion
> 2014-08-07 17:53:31,426 INFO org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool BP-1887080305-172.28.0.101-1407398838872 Total blocks: 1, missing metadata files:0, missing block files:0, missing blocks in memory:1, mismatched blocks:0
> 2014-08-07 17:53:31,426 WARN org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added missing block to memory FinalizedReplica, blk_1073741825_1001, FINALIZED
>   getNumBytes()     = 21230663
>   getBytesOnDisk()  = 21230663
>   getVisibleLength()= 21230663
>   getVolume()       = /hadoop/data1/dfs/data/current
>   getBlockFile()    = /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
>   unlinked          =false
> 2014-08-07 17:53:31,531 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleted BP-1887080305-172.28.0.101-1407398838872 blk_1073741825_1001 file /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825
> {code}
> As a result, information about the block being deleted is registered in DataNode's memory.
> When DataNode then sends a block report, NameNode receives wrong block information.
> For example, when we recommission a node or change the replication factor, NameNode may delete the correct replica as "ExcessReplicate" because of this problem.
> "Under-Replicated Blocks" and "Missing Blocks" then occur.
> When DataNode runs DirectoryScanner, it should not register a block that is being deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
