hadoop-hdfs-issues mailing list archives

From "Ming Ma (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-9390) Block management for maintenance states
Date Fri, 23 Sep 2016 23:10:20 GMT

     [ https://issues.apache.org/jira/browse/HDFS-9390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Ming Ma updated HDFS-9390:
    Attachment: HDFS-9390.patch

[~eddyxu] sorry for the delay. Because of the big difference between trunk and 2.6, which the initial
patch is based on, it required quite a lot of work. Here is the draft patch. A couple of notes:

* Erasure coding might need more work; at the least, new unit tests are required. We can use another
jira for that.
* It seems the safety properties maintained by BlockManager are only implied in the code. I have
started to document more of them as part of this patch.
* There are other issues the patch tries to fix along the way; for example, {BlockManager#getRedundancy}
can be removed.

> Block management for maintenance states
> ---------------------------------------
>                 Key: HDFS-9390
>                 URL: https://issues.apache.org/jira/browse/HDFS-9390
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Ming Ma
>         Attachments: HDFS-9390.patch
> When a node transitions into, stays in, or transitions out of maintenance state, we need
to make sure the blocks on that node are properly handled.
> * When nodes are put into maintenance, they will first go to ENTERING_MAINTENANCE, which
makes sure blocks are minimally replicated before the nodes are transitioned to IN_MAINTENANCE.
> * Do not replicate blocks when nodes are in maintenance states. Maintenance replicas will
remain in BlockMaps and thus are still considered valid from the block replication point of view.
In other words, putting a node into “maintenance” mode won’t trigger BlockManager to replicate
its blocks.
> * Do not invalidate replicas on nodes under maintenance. After any file's replication
factor is reduced, NN needs to invalidate some replicas. It should exclude nodes under maintenance
when doing so.
> * Do not put IN_MAINTENANCE replicas in LocatedBlock for read operation.
> * Do not allocate any new block on nodes under maintenance.
> * Have Balancer exclude nodes under maintenance.
> * Exclude nodes under maintenance for DN cache.
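
The rules in the description above can be illustrated with a minimal, self-contained Java sketch. This is not code from the patch and does not use the real BlockManager API: the class MaintenanceSketch, the Node type, and all method names are hypothetical. It also assumes that only replicas on NORMAL nodes count toward the "minimally replicated" check, and that "nodes under maintenance" covers both ENTERING_MAINTENANCE and IN_MAINTENANCE; the description does not spell out either point.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in HDFS.
final class MaintenanceSketch {
    // Simplified admin states, named after the states in the description.
    enum AdminState { NORMAL, ENTERING_MAINTENANCE, IN_MAINTENANCE }

    // Stand-in for a datanode that holds one replica of a block.
    static final class Node {
        final String name;
        final AdminState state;
        Node(String name, AdminState state) { this.name = name; this.state = state; }
    }

    // Assumption: a node may move from ENTERING_MAINTENANCE to IN_MAINTENANCE
    // once at least minLive replicas of the block sit on NORMAL nodes.
    static boolean canEnterMaintenance(List<Node> holders, int minLive) {
        int live = 0;
        for (Node n : holders) {
            if (n.state == AdminState.NORMAL) {
                live++;
            }
        }
        return live >= minLive;
    }

    // Maintenance replicas still count as valid, so a node going into
    // maintenance does not by itself make the block look under-replicated.
    static boolean needsReplication(List<Node> holders, int replicationFactor) {
        return holders.size() < replicationFactor;
    }

    // Locations handed to readers: IN_MAINTENANCE replicas are left out.
    static List<Node> readLocations(List<Node> holders) {
        List<Node> result = new ArrayList<>();
        for (Node n : holders) {
            if (n.state != AdminState.IN_MAINTENANCE) {
                result.add(n);
            }
        }
        return result;
    }

    // Replicas that may be invalidated after a replication-factor reduction:
    // assumed here to exclude nodes in either maintenance state.
    static List<Node> invalidationCandidates(List<Node> holders) {
        List<Node> result = new ArrayList<>();
        for (Node n : holders) {
            if (n.state == AdminState.NORMAL) {
                result.add(n);
            }
        }
        return result;
    }
}
```

With one replica each on a NORMAL, an ENTERING_MAINTENANCE, and an IN_MAINTENANCE node and a replication factor of 3, this sketch reports no need for re-replication, returns two read locations, and offers only the NORMAL replica for invalidation.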

