hadoop-common-dev mailing list archives

From "Christian Kunz (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1664) Hadoop DFS upgrade procedure
Date Mon, 30 Jul 2007 19:15:53 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516488 ]

Christian Kunz commented on HADOOP-1664:
----------------------------------------

Datanode servers were apparently successful in upgrading:
...
2007-07-26 10:35:34,973 INFO org.apache.hadoop.dfs.DataNode:
   Distributed upgrade for DataNode version -6 to current LV -7 is initialized.
2007-07-26 10:35:34,974 INFO org.apache.hadoop.dfs.Storage: Upgrading storage directory <hadoop-dir>/dfs/data.
   old LV = -5; old CTime = 1183153812398.
   new LV = -7; new CTime = 1185471333047
2007-07-26 10:36:58,098 INFO org.apache.hadoop.dfs.Storage: Upgrade of /<hadoop-dir>/dfs/data is complete.
2007-07-26 10:36:58,587 INFO org.apache.hadoop.dfs.DataNode: Opened server at 50010
...

but the namenode server still reported 0% upgrade completion long after that:

2007-07-26 10:43:04,818 INFO org.apache.hadoop.dfs.BlockCrcUpgradeNamenode: Upgrade still running.
                                 Avg completion on Datanodes: 0.00% with 0 errors.
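
For anyone who wants to watch for this condition automatically, the progress report above can be pulled out of the namenode log with a short script. This is only a minimal sketch: the regular expression is inferred from the single log line quoted above, the function name parse_upgrade_progress is hypothetical, and other releases may word the message differently.

```python
import re
from typing import Optional, Tuple

# Assumption: the progress line always has the shape seen in the quoted log,
# e.g. "Avg completion on Datanodes: 0.00% with 0 errors."
PROGRESS_RE = re.compile(
    r"Avg completion on Datanodes:\s*(?P<pct>\d+(?:\.\d+)?)%"
    r"\s+with\s+(?P<errors>\d+)\s+errors"
)


def parse_upgrade_progress(log_text: str) -> Optional[Tuple[float, int]]:
    """Return (percent_complete, error_count) from the most recent
    progress report found in log_text, or None if there is none."""
    matches = list(PROGRESS_RE.finditer(log_text))
    if not matches:
        return None
    last = matches[-1]
    return float(last.group("pct")), int(last.group("errors"))


if __name__ == "__main__":
    sample = (
        "2007-07-26 10:43:04,818 INFO "
        "org.apache.hadoop.dfs.BlockCrcUpgradeNamenode: Upgrade still running.\n"
        "    Avg completion on Datanodes: 0.00% with 0 errors.\n"
    )
    print(parse_upgrade_progress(sample))  # prints (0.0, 0)
```

A reading that stays at 0.00% with 0 errors across several reports, as it did here, would be the signal that the namenode is not receiving progress from the datanodes.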

Even after 40 minutes the reported status had not changed and the namenode was still in
safe mode. When I tried to force it to leave safe mode, it refused:

hadoop dfsadmin -safemode leave
safemode: org.apache.hadoop.dfs.SafeModeException: Distributed upgrade is in progress. Name node is in safe mode.



> Hadoop DFS upgrade procedure
> ----------------------------
>
>                 Key: HADOOP-1664
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1664
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.14.0
>            Reporter: Christian Kunz
>
> When upgrading from a July-9 to a July-25 nightly release, we were able to upgrade
> successfully on a single-node cluster, but the upgrade failed on a 10-node and a
> 200-node cluster.
> As we are not sure whether we made a mistake, I am filing this as an improvement. But
> going forward it is imperative that there is a safe and well-documented procedure to
> upgrade dfs without loss of data, including a rollback procedure and a list of
> operational procedures that are irreversibly destructive (hopefully an empty list).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

