hadoop-hdfs-issues mailing list archives

From "Lin Yiqun (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9819) FsVolume should tolerate few times check-dir failed due to deletion by mistake
Date Thu, 18 Feb 2016 05:46:18 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151769#comment-15151769 ]

Lin Yiqun commented on HDFS-9819:
---------------------------------

Thanks for [~vinayrpet]'s comments. HDFS-8845, which makes DiskChecker stop checking files
recursively, can solve the problem in my scenario. But I would like to know: should we let a
volume tolerate a few check-dir failures? Is there any other external reason that could cause
a check-dir failure?
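
To make the idea concrete, here is a minimal sketch of the kind of tolerance counter being
proposed. The class and the threshold below are illustrative assumptions, not existing HDFS
code; only {{DiskChecker.DiskErrorException}} is a real Hadoop class:

{code}
import org.apache.hadoop.util.DiskChecker.DiskErrorException;

// Hypothetical sketch, not the actual HDFS implementation: only mark a
// volume as failed after several consecutive check-dir failures, so a
// dir deleted by mistake (and restored) does not trigger re-replication.
public class TolerantDirChecker {

  /** Stand-in for the volume's real check-dir call. */
  public interface DirCheck {
    void run() throws DiskErrorException;
  }

  // Assumed threshold, modeled on dfs.datanode.failed.volumes.tolerated.
  private final int maxToleratedFailures;
  private int consecutiveFailures = 0;

  public TolerantDirChecker(int maxToleratedFailures) {
    this.maxToleratedFailures = maxToleratedFailures;
  }

  /** Runs one periodic check; returns true if the volume should be failed. */
  public synchronized boolean checkOnce(DirCheck check) {
    try {
      check.run();
      consecutiveFailures = 0;  // volume healthy again: reset the counter
      return false;
    } catch (DiskErrorException e) {
      consecutiveFailures++;
      // Tolerate transient failures instead of failing on the first miss.
      return consecutiveFailures > maxToleratedFailures;
    }
  }
}
{code}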

> FsVolume should tolerate few times check-dir failed due to deletion by mistake
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-9819
>                 URL: https://issues.apache.org/jira/browse/HDFS-9819
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.7.1
>            Reporter: Lin Yiqun
>            Assignee: Lin Yiqun
>         Attachments: HDFS-9819.001.patch
>
>
> FsVolume should tolerate a few check-dir failures, because sometimes a dir/file under the
> datanode data-dirs is deleted by mistake. The {{DataNode#startCheckDiskErrorThread}} then
> invokes the checkDir method periodically, finds that the dir does not exist, and throws an
> exception. The checked volume is added to the failed-volume list, and the blocks on this
> volume are re-replicated. But actually this is unnecessary. We should let a volume tolerate
> a few check-dir failures, similar to the config {{dfs.datanode.failed.volumes.tolerated}}.
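
As a usage illustration of the sketch above: a periodic loop in the spirit of
{{DataNode#startCheckDiskErrorThread}} could drive the check. The class name, data-dir
path, and interval below are assumptions for illustration, not the actual DataNode code:

{code}
import java.io.File;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.util.DiskChecker.DiskErrorException;

public class PeriodicCheckExample {
  public static void main(String[] args) {
    // Tolerate up to 3 consecutive check-dir failures (assumed value).
    TolerantDirChecker checker = new TolerantDirChecker(3);
    ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();
    scheduler.scheduleAtFixedRate(() -> {
      boolean failVolume = checker.checkOnce(() -> {
        File dir = new File("/data/dn1/current");  // illustrative path
        if (!dir.isDirectory()) {
          throw new DiskErrorException("data dir missing: " + dir);
        }
      });
      if (failVolume) {
        // Only now would the volume go on the failed-volume list and
        // re-replication of its blocks be triggered.
        System.err.println("volume exceeded tolerated check-dir failures");
      }
    }, 0, 60, TimeUnit.SECONDS);
  }
}
{code}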



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
