hadoop-hdfs-issues mailing list archives

From "Zhe Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8193) Add the ability to delay replica deletion for a period of time
Date Tue, 21 Apr 2015 17:37:59 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505336#comment-14505336 ]

Zhe Zhang commented on HDFS-8193:
---------------------------------

Thanks Chris for bringing up the questions. 

bq. HDFS-6186 only applies at NameNode startup.  Is the new feature something that could be
triggered at any time on a running NameNode, such as right before a manual HA failover?
The short answer is yes. One can think of it as a "trash" for block replicas, fully controlled
by the DN hosting them. This should shelter block replicas from most administrative mistakes
and NN bugs (which are more likely than DN bugs, given the NN's complexity) for a period of time.
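
To make the "trash" idea concrete, here is a minimal sketch of a DN-side holding area with a
retention period. It is only an illustration under stated assumptions, not the design of the
eventual patch: the class name ReplicaTrash, its methods, and the retention_seconds parameter
are all hypothetical, and real replicas would be files on disk rather than dictionary entries.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ReplicaTrash:
    """Hypothetical DN-side holding area for replicas the NN asked to delete."""

    retention_seconds: float
    # Injectable clock so the retention logic can be exercised deterministically.
    clock: Callable[[], float] = time.monotonic
    # block_id -> time at which the deletion request arrived
    _pending: Dict[int, float] = field(default_factory=dict)

    def request_delete(self, block_id: int) -> None:
        # Record the request instead of deleting; keep the earliest timestamp.
        self._pending.setdefault(block_id, self.clock())

    def restore(self, block_id: int) -> bool:
        # True if the replica was still being held and is now considered live again.
        return self._pending.pop(block_id, None) is not None

    def purge_expired(self) -> List[int]:
        # Return (and forget) the blocks whose retention period has elapsed.
        now = self.clock()
        expired = [b for b, t in self._pending.items()
                   if now - t >= self.retention_seconds]
        for b in expired:
            del self._pending[b]
        return expired
```

In this sketch a deletion command from the NN only calls request_delete; the replica is
physically removed later, when a periodic purge_expired pass returns its block ID. Until
then, restore can bring it back.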

To answer the question from [~sureshms] under HDFS-6186:
bq. One problem with not deleting the blocks for a deleted file is, how does one restore it?
Can we address in this jira pausing deletion after startup and address the suggestion you
have made, along with other changes that might be necessary, in another jira.
First, NN bugs could cause block replicas to be deleted without deleting the file. Second,
it's rather easy to back up NN metadata before performing maintenance, but extremely difficult
to back up actual DN data. This JIRA aims to address that deficiency / discrepancy.

As future work, we plan to investigate an even more radical retention policy, where block
replicas are not deleted until the DN is actually running out of space. At that point, victims
are selected among the pending-deletion replicas using a smart algorithm, and are overwritten
by incoming replicas. We'll file a separate JIRA for that, after this JIRA builds the basic
DN-side replica retention machinery.
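
As a rough illustration of victim selection under space pressure, the sketch below picks
pending-deletion replicas until enough bytes would be freed. The selection algorithm is not
specified above, so oldest-request-first is used purely as a placeholder assumption; the
function name select_victims and the shape of the pending map are hypothetical as well.

```python
from typing import Dict, List, Tuple


def select_victims(pending: Dict[int, Tuple[float, int]],
                   bytes_needed: int) -> List[int]:
    """Pick pending-deletion replicas to overwrite, oldest request first.

    pending maps block_id -> (deletion_request_time, size_bytes).
    Oldest-first is a placeholder policy, not the "smart algorithm" itself.
    If the pending replicas cannot cover bytes_needed, all of them are returned.
    """
    victims: List[int] = []
    freed = 0
    # Walk the candidates in order of deletion request time, oldest first.
    for block_id, (_, size) in sorted(pending.items(), key=lambda kv: kv[1][0]):
        if freed >= bytes_needed:
            break
        victims.append(block_id)
        freed += size
    return victims
```

A smarter policy could also weigh replica size or whether other live replicas of the block
exist; that only changes the sort key in this sketch.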

> Add the ability to delay replica deletion for a period of time
> --------------------------------------------------------------
>
>                 Key: HDFS-8193
>                 URL: https://issues.apache.org/jira/browse/HDFS-8193
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: namenode
>    Affects Versions: 2.7.0
>            Reporter: Aaron T. Myers
>            Assignee: Zhe Zhang
>
> When doing maintenance on an HDFS cluster, users may be concerned about the possibility
> of administrative mistakes or software bugs deleting replicas of blocks that cannot easily
> be restored. It would be handy if HDFS could be made to optionally not delete any replicas
> for a configurable period of time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
