hadoop-hdfs-issues mailing list archives

From "Stephen O'Donnell (Jira)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-14854) Create improved decommission monitor implementation
Date Tue, 01 Oct 2019 14:05:00 GMT

     [ https://issues.apache.org/jira/browse/HDFS-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen O'Donnell updated HDFS-14854:
    Attachment: HDFS-14854.004.patch

> Create improved decommission monitor implementation
> ---------------------------------------------------
>                 Key: HDFS-14854
>                 URL: https://issues.apache.org/jira/browse/HDFS-14854
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 3.3.0
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>         Attachments: Decommission_Monitor_V2_001.pdf, HDFS-14854.001.patch, HDFS-14854.002.patch,
HDFS-14854.003.patch, HDFS-14854.004.patch
> In HDFS-13157, we discovered a series of problems with the current decommission monitor
implementation, such as:
>  * Blocks are replicated sequentially disk by disk and node by node, and hence the load
is not spread well across the cluster
>  * Adding a node for decommission can cause the namenode write lock to be held for a
long time.
>  * Decommissioning nodes floods the replication queue, and under-replicated blocks from
a future node or disk failure may wait a long time before they are replicated.
>  * Blocks pending replication are checked many times under a write lock before they are
sufficiently replicated, wasting resources.
> In this Jira I propose to create a new implementation of the decommission monitor that
resolves these issues. As it will be difficult to prove that one implementation is better than
the other, the new implementation can be enabled or disabled, so operators can choose between
the existing implementation and the new one.
> I will attach a pdf with some more details on the design and then a version 1 patch shortly.
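
Editor's illustration (not taken from the patch or the attached design PDF): the first and third problems above come down to the order in which blocks are queued for replication and how many are queued at once. The sketch below contrasts queuing node by node with taking one block from each node in turn, and shows a simple cap on how many blocks are handed out per pass. All names (`sequential_order`, `interleaved_order`, `take_batch`, `pending_limit`) are hypothetical and do not correspond to HDFS classes or configuration keys.

```python
from collections import deque


def sequential_order(blocks_by_node):
    """Queue every block of one node before moving on to the next node."""
    return [(node, blk) for node, blocks in blocks_by_node.items() for blk in blocks]


def interleaved_order(blocks_by_node):
    """Take one block from each node in turn so the work is spread across nodes."""
    queues = deque(
        (node, deque(blocks)) for node, blocks in blocks_by_node.items() if blocks
    )
    order = []
    while queues:
        node, blocks = queues.popleft()
        order.append((node, blocks.popleft()))
        if blocks:
            # Node still has blocks left: send it to the back of the rotation.
            queues.append((node, blocks))
    return order


def take_batch(order, pending_limit):
    """Split off at most `pending_limit` blocks; the rest wait for a later pass."""
    return order[:pending_limit], order[pending_limit:]
```

With three decommissioning nodes holding three, one, and two blocks, `sequential_order` emits all of the first node's blocks before any other node is touched, while `interleaved_order` visits each node once before returning to the first. Capping each pass with `take_batch` is one way to leave room in a queue for blocks that become under-replicated for other reasons; whether the proposed monitor works this way is not stated in this message.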

