hadoop-hdfs-issues mailing list archives

From "Ming Ma (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-7541) Support for fast HDFS datanode rolling upgrade
Date Wed, 17 Dec 2014 19:57:15 GMT

     [ https://issues.apache.org/jira/browse/HDFS-7541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HDFS-7541:
    Attachment: SupportforfastHDFSdatanoderollingupgrade.pdf

We ([~ctrezzo], [~jmeagher], [~lohit], [~l201514], [~kihwal], and others) discussed ways
to address this. Attached is the initial high-level design document.

* Upgrade domain support. HDFS-3566 outlines the idea, but it isn't applicable to Hadoop 2,
and it uses the network topology to store the upgrade domain definition. We can make the load
balancer more extensible so that it supports different policies.

* Have the NN support a new "maintenance" datanode state. In this state, the DN won't process
read/write requests, but its replicas will remain in BlockMaps and thus are still considered
valid from a block replication point of view.
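To make the two ideas above concrete, here is a minimal sketch in Python. It is not HDFS code and does not reflect the attached design document. The names `DnState`, `Datanode`, `upgrade_domain`, and the helper functions are all hypothetical, and the replication factor of 3 is only an assumed default. It illustrates that a replica on a DN in the "maintenance" state still counts toward replication while not serving reads, and that spreading replicas across upgrade domains keeps a block readable while one domain is down.

```python
from dataclasses import dataclass
from enum import Enum


class DnState(Enum):
    NORMAL = "normal"
    MAINTENANCE = "maintenance"  # hypothetical name for the proposed state
    DEAD = "dead"


@dataclass(frozen=True)
class Datanode:
    name: str
    upgrade_domain: str  # hypothetical label, stored independently of topology
    state: DnState = DnState.NORMAL


def counted_replicas(replicas):
    """Replicas that still count toward the replication factor."""
    return [dn for dn in replicas
            if dn.state in (DnState.NORMAL, DnState.MAINTENANCE)]


def readable_replicas(replicas):
    """Replicas on DNs that currently serve read/write requests."""
    return [dn for dn in replicas if dn.state is DnState.NORMAL]


def needs_rereplication(replicas, replication_factor=3):
    """True if too few replicas are counted as valid."""
    return len(counted_replicas(replicas)) < replication_factor


def survives_domain_upgrade(replicas, domain):
    """True if a readable replica exists outside the given upgrade domain."""
    return any(dn.upgrade_domain != domain
               for dn in readable_replicas(replicas))
```

In this sketch, a block with replicas on `ud1`, `ud2` (in maintenance), and `ud3` does not trigger re-replication, and it stays readable while all of `ud1` is restarted.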

Appreciate any input.

> Support for fast HDFS datanode rolling upgrade
> ----------------------------------------------
>                 Key: HDFS-7541
>                 URL: https://issues.apache.org/jira/browse/HDFS-7541
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Ming Ma
>         Attachments: SupportforfastHDFSdatanoderollingupgrade.pdf
> The current HDFS DN rolling upgrade procedure requires sequential DN restarts to minimize the
impact on data availability and read/write operations. The side effect is a longer upgrade
duration for large clusters. This might be acceptable for a quick DN JVM restart to update Hadoop
code/configuration. However, for an OS upgrade that requires a machine reboot, the overall upgrade
duration will be too long if we continue to do sequential DN rolling restarts.
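
As a rough illustration of why sequential restarts scale poorly, the sketch below compares restarting DNs one at a time with restarting them in batches (for example, one upgrade domain at a time). The function names and any numbers passed to them are hypothetical and are not taken from the issue or the design document. The model also assumes every restart takes the same time and ignores re-replication and verification steps.

```python
import math


def sequential_minutes(num_datanodes, minutes_per_restart):
    """Total time when DNs are restarted strictly one after another."""
    return num_datanodes * minutes_per_restart


def batched_minutes(num_datanodes, batch_size, minutes_per_restart):
    """Total time when batch_size DNs are restarted in parallel per step."""
    batches = math.ceil(num_datanodes / batch_size)
    return batches * minutes_per_restart
```

Under these assumptions, the sequential duration grows linearly with the number of DNs, while the batched duration grows with the number of batches.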

This message was sent by Atlassian JIRA