hadoop-hdfs-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7722) DataNode#checkDiskError should also remove Storage when error is found.
Date Mon, 09 Mar 2015 21:07:38 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353605#comment-14353605 ]

Chris Nauroth commented on HDFS-7722:

Eddy, it looks good.  I have just one minor nit.  In {{TestDataNodeVolumeFailureReporting}},
please remove the commented-out lines of test code for the final version of the patch.  Also,
we can no longer remove the import of {{org.apache.hadoop.hdfs.protocol.Block}}, because another
patch started using it recently.

bq. I suggest to have a following JIRA...

Please feel free to do that if you wish, but I actually don't think it's necessary.  In general,
I don't expect permanent removal of a volume to be the typical recovery procedure.  Instead,
I expect a more typical recovery procedure to be like you described: replace the faulty disk.
Since that works fine, I think it would be overkill at this point to put in dedicated functionality
to cover something that is probably a very rare edge case in practical deployments.

Thanks for working on this!

> DataNode#checkDiskError should also remove Storage when error is found.
> -----------------------------------------------------------------------
>                 Key: HDFS-7722
>                 URL: https://issues.apache.org/jira/browse/HDFS-7722
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.6.0
>            Reporter: Lei (Eddy) Xu
>            Assignee: Lei (Eddy) Xu
>         Attachments: HDFS-7722.000.patch, HDFS-7722.001.patch, HDFS-7722.002.patch
> When {{DataNode#checkDiskError}} finds disk errors, it removes all block metadata from
{{FsDatasetImpl}}. However, it does not remove the corresponding {{DataStorage}} and {{BlockPoolSliceStorage}}.

> As a result, we cannot directly run {{reconfig}} to hot swap the failed disks
without changing the configuration file.

This message was sent by Atlassian JIRA
