hadoop-hdfs-issues mailing list archives

From "Kihwal Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5535) Umbrella jira for improved HDFS rolling upgrades
Date Wed, 15 Jan 2014 17:00:25 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872269#comment-13872269 ]

Kihwal Lee commented on HDFS-5535:

bq. The total time required to upgrade a cluster MUST not exceed #Nodes_in_cluster * 10 seconds.
This is about how fast the upgrade process can go while minimally impacting service and data
availability. Please note that this is a requirement for the upgrade feature. It does not
dictate what users should do.  This requirement exists mainly to help users estimate how soon
a cluster can be upgraded and also force us to guarantee estimates stay valid in the future.
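
As a rough illustration of how that bound could be used for planning, here is a minimal sketch of the arithmetic. The function names are hypothetical and are not part of HDFS or of the design document; only the 10-seconds-per-node figure comes from the quoted requirement.

```python
# Illustrative only: upper bound on total upgrade time implied by the
# quoted requirement (#Nodes_in_cluster * 10 seconds).

SECONDS_PER_NODE_BUDGET = 10  # figure taken from the quoted requirement


def max_upgrade_seconds(num_nodes: int) -> int:
    """Return the upper bound, in seconds, for upgrading num_nodes nodes."""
    if num_nodes < 0:
        raise ValueError("num_nodes must be non-negative")
    return num_nodes * SECONDS_PER_NODE_BUDGET


def format_duration(seconds: int) -> str:
    """Render a number of seconds as hours, minutes and seconds."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes}m {secs}s"


if __name__ == "__main__":
    # Hypothetical cluster size, chosen only for the example.
    nodes = 4000
    print(format_duration(max_upgrade_seconds(nodes)))
```

This is a ceiling the feature has to meet, not a prediction of how long any particular upgrade will take.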

bq. Probably meant to say that old software should be able to support whatever state of the
file system left after the upgrade experiment was terminated?
I know you didn't intend it that way, but this sounds as if the requirement is reduced to maintaining
file system integrity. It could simply be "Data durability must not be compromised by upgrades
or downgrades".

bq. May be it needs to roll edits in some special way to indicate the start of the rolling
I believe this came up during discussions, but I do not remember the conclusion.  We will clarify.

bq. What is MTTR?
Mean time to recovery.

bq. Looks like Lite-Decom and “Optimizing DN Restart time” are competing proposals
Yes, indeed. We will do the latter, which is more in line with existing tool-driven approaches.
Lite-Decom may be considered in later development phases for other use cases (e.g., the case
Ming Ma mentioned above), but regular DN rolling upgrade won't depend on it.

> Umbrella jira for improved HDFS rolling upgrades
> ------------------------------------------------
>                 Key: HDFS-5535
>                 URL: https://issues.apache.org/jira/browse/HDFS-5535
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, ha, hdfs-client, namenode
>    Affects Versions: 3.0.0, 2.2.0
>            Reporter: Nathan Roberts
>         Attachments: HDFSRollingUpgradesHighLevelDesign.pdf
> In order to roll a new HDFS release through a large cluster quickly and safely, a few
> enhancements are needed in HDFS. An initial High level design document will be attached to
> this jira, and sub-jiras will itemize the individual tasks.

This message was sent by Atlassian JIRA
