hadoop-common-issues mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-8371) Hadoop 1.0.1 release - DFS rollback issues
Date Tue, 08 May 2012 18:45:48 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated HADOOP-8371:
------------------------------------

    Description: See the next comment for details.  (was: h1.Test Setup
All tests were done on a single-node cluster that runs the namenode, secondarynamenode, and datanode on one machine running Ubuntu 12.04.
/usr/local/hadoop/ is a soft link to /usr/local/hadoop-0.20.203.0/
/usr/local/hadoop-1.0.1 contains the upgrade version.
h1. Version - 0.20.203.0
* Formatted name node.
* Contents of {dfs.name.dir}/current/VERSION
{quote}
Tue May 08 08:08:57 EDT 2012
namespaceID=350250898
cTime=0
storageType=NAME_NODE
layoutVersion=-31
{quote}
* Contents of {dfs.name.dir}/previous.checkpoint/VERSION
{quote}
Tue May 08 08:03:35 EDT 2012
namespaceID=350250898
cTime=0
storageType=NAME_NODE
layoutVersion=-31
{quote}
* Copied a few test files into HDFS.
* Output from the "hadoop dfs -lsr /" command
{quote}
hduser@ruff790:/usr/local/hadoop/bin$ ./hadoop dfs -lsr /
drwxr-xr-x - hduser supergroup 0 2012-05-08 08:04 /test
-rw-r--r-- 1 hduser supergroup 27574849 2012-05-08 08:04 /test/rr_archive_1655003175_1660003165.gz
-rw-r--r-- 1 hduser supergroup 18065179 2012-05-08 08:04 /test/twonkyportal.log.2011-12-03.rr.gz
drwxr-xr-x - hduser supergroup 0 2012-05-08 08:04 /user
drwxr-xr-x - hduser supergroup 0 2012-05-08 08:04 /user/hduser
{quote}
* Executed "hadoop dfsadmin -finalizeUpgrade" (I do not think this is required, but I do not think it should matter either).
* Stopped DFS by executing "stop-dfs.sh"

h1. Version - 1.0.1
h2. Upgrade
* Tried starting DFS by running /usr/local/hadoop-1.0.1/bin/start-dfs.sh
* As expected, the name node failed to start due to a version mismatch.
{quote}
2012-05-08 08:22:38,166 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.IOException:
File system image contains an old layout version -31.
An upgrade to version -32 is required.
Please restart NameNode with -upgrade option.
{quote}
* Ran /usr/local/hadoop-1.0.1/bin/stop-dfs.sh to stop datanode and secondarynamenode.
* Started DFS by running /usr/local/hadoop-1.0.1/bin/start-dfs.sh -upgrade
* Checked upgrade status by calling /usr/local/hadoop-1.0.1/bin/hadoop dfsadmin -upgradeProgress status
{quote}
Upgrade for version -32 has been completed.
Upgrade is not finalized.
{quote}
* Contents of {dfs.name.dir}/current/VERSION
{quote}
#Tue May 08 08:25:51 EDT 2012
namespaceID=350250898
cTime=1336479951669
storageType=NAME_NODE
layoutVersion=-32
{quote}
* Contents of {dfs.name.dir}/previous.checkpoint/VERSION
{quote}
Tue May 08 08:03:35 EDT 2012
namespaceID=350250898
cTime=0
storageType=NAME_NODE
layoutVersion=-31
{quote}
* Contents of {dfs.name.dir}/previous/VERSION
{quote}
#Tue May 08 08:08:57 EDT 2012
namespaceID=350250898
cTime=0
storageType=NAME_NODE
layoutVersion=-31
{quote}
* Checked to make sure I can list the contents of DFS.
* Stopped DFS.
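
The VERSION files quoted above are plain key=value text, so they can be compared mechanically. The following is a minimal Python sketch, not part of Hadoop: the helper name `parse_version` is hypothetical, and reading cTime as milliseconds since the Unix epoch is an assumption that happens to agree with the header timestamp of the upgraded file.

```python
from datetime import datetime, timezone

def parse_version(text):
    """Parse the key=value lines of a storage VERSION file into a dict."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, '#' comment lines, and the timestamp header line.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props

# Contents of {dfs.name.dir}/current/VERSION after the upgrade, as reported.
UPGRADED = """#Tue May 08 08:25:51 EDT 2012
namespaceID=350250898
cTime=1336479951669
storageType=NAME_NODE
layoutVersion=-32
"""

props = parse_version(UPGRADED)
layout_version = int(props["layoutVersion"])
# Assumption: cTime is milliseconds since the Unix epoch.
ctime_utc = datetime.fromtimestamp(int(props["cTime"]) // 1000, tz=timezone.utc)
print(layout_version, ctime_utc.strftime("%Y-%m-%d %H:%M:%S"))
```

Under that assumption the cTime decodes to 2012-05-08 12:25:51 UTC, which is 08:25:51 EDT, the same instant as the file's header line.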

h2. Rollback
* Started DFS by running /usr/local/hadoop-1.0.1/bin/start-dfs.sh -rollback
* As per contents of "hadoop-hduser-namenode-ruff790.log", rollback seems to have succeeded.
{quote}
2012-05-08 08:37:41,799 INFO org.apache.hadoop.hdfs.server.common.Storage: Rolling back storage directory /usr/local/app/hadoop/tmp/dfs/name.
new LV = -31; new CTime = 0
2012-05-08 08:37:41,801 INFO org.apache.hadoop.hdfs.server.common.Storage: Rollback of /usr/local/app/hadoop/tmp/dfs/name is complete.
{quote}
* Contents of {dfs.name.dir}/current/VERSION
{quote}
Tue May 08 08:37:42 EDT 2012
namespaceID=350250898
cTime=0
storageType=NAME_NODE
layoutVersion=-31
{quote}
* Contents of {dfs.name.dir}/previous.checkpoint/VERSION
{quote}
#Tue May 08 08:08:57 EDT 2012
namespaceID=350250898
cTime=0
storageType=NAME_NODE
layoutVersion=-31
{quote}
* Checked to make sure I can list the contents of DFS
{quote}
hduser@ruff790:/usr/local/hadoop-1.0.1/bin$ ./hadoop dfs -lsr /
drwxr-xr-x - hduser supergroup 0 2012-05-08 08:04 /test
-rw-r--r-- 1 hduser supergroup 27574849 2012-05-08 08:04 /test/rr_archive_1655003175_1660003165.gz
-rw-r--r-- 1 hduser supergroup 18065179 2012-05-08 08:04 /test/twonkyportal.log.2011-12-03.rr.gz
drwxr-xr-x - hduser supergroup 0 2012-05-08 08:04 /user
drwxr-xr-x - hduser supergroup 0 2012-05-08 08:04 /user/hduser
{quote}
* However, at this point I could not browse the file system from the WebUI. Then I realized that the data node was not actually running. From the data node log file, it seems it had shut down during the rollback process.
{quote}
2012-05-08 08:37:57,953 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DataNode is shutting down: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.UnregisteredDatanodeException: Unregistered data node: 127.0.0.1:50010
at org.apache.hadoop.hdfs.server.namenode.NameNode.verifyRequest(NameNode.java:1077)
{quote}
* So I ran "stop-dfs.sh" to shut down the namenode and secondarynamenode.
* Next "start-dfs.sh" fails to start the name node, as expected, with a version mismatch error.
{quote}
2012-05-08 08:50:51,084 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.IOException:
File system image contains an old layout version -31.
An upgrade to version -32 is required.
Please restart NameNode with -upgrade option.
{quote}
* Shut everything down and went back to the old version.
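
The "version mismatch" messages in the steps above follow from comparing the layoutVersion stored on disk with the one the running software expects. The sketch below only illustrates that comparison in Python and is not the NameNode's actual code: the function name `startup_decision` and its return strings are hypothetical. It relies on layout versions being negative, with a more negative number meaning a newer layout, as the -31 to -32 upgrade in the logs suggests.

```python
def startup_decision(stored_lv, software_lv):
    """Classify a start attempt from the stored and expected layout versions."""
    if stored_lv == software_lv:
        return "start normally"
    if stored_lv > software_lv:
        # Stored layout is older (less negative) than what the software writes.
        return ("old layout version %d, upgrade to version %d is required"
                % (stored_lv, software_lv))
    # Stored layout is newer (more negative) than the software understands.
    return ("stored layout version %d is newer than software version %d"
            % (stored_lv, software_lv))

# 1.0.1 software (-32) started against storage rolled back to -31.
print(startup_decision(-31, -32))
# 0.20.203.0 software (-31) started against its own layout.
print(startup_decision(-31, -31))
```

This is why a plain "start-dfs.sh" from the 1.0.1 tree is expected to fail after a rollback: the storage is at -31 again while that software expects -32.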

h1. Version - 0.20.203.0 (Again)
* Now that I had rolled back the "1.0.1" upgrade, I thought I could go back to version 0.20.203.0.
* So I went back and ran /usr/local/hadoop/bin/start-dfs.sh, but the namenode did not start up. It failed with the error message:
{quote}
2012-05-08 08:57:09,261 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.IOException: Unexpected version of the file system log file: -32. Current version = -31.
{quote})
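
The last error refers to the edit log rather than the VERSION file, so it may help to inspect the log's header directly. The Python sketch below assumes the edit log starts with the layout version stored as a 4-byte big-endian signed integer. That is an assumption about the 0.20-era format, not something stated in this report. The helper names `read_log_version` and `check_edit_log` are hypothetical, and the script works on a stand-in temporary file rather than a real edits file.

```python
import os
import struct
import tempfile

def read_log_version(path):
    """Return the 4-byte big-endian signed int at the start of the file."""
    with open(path, "rb") as f:
        header = f.read(4)
    if len(header) < 4:
        raise ValueError("file too short to hold a layout version")
    return struct.unpack(">i", header)[0]

def check_edit_log(log_version, software_lv):
    """Return an error string if the log is newer than the software, else None."""
    # Layout versions are negative; a smaller number means a newer format.
    if log_version < software_lv:
        return ("Unexpected version of the file system log file: %d. "
                "Current version = %d." % (log_version, software_lv))
    return None

# Build a stand-in edits file whose header carries layout version -32.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(struct.pack(">i", -32))
    edits_path = tmp.name

try:
    found = read_log_version(edits_path)
    message = check_edit_log(found, software_lv=-31)
finally:
    os.remove(edits_path)
print(found, message)
```

If the assumption holds, running `read_log_version` against the real edits file under {dfs.name.dir}/current would show whether the rollback left a -32 header behind even though VERSION reports -31.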

> Hadoop 1.0.1 release - DFS rollback issues
> ------------------------------------------
>
>                 Key: HADOOP-8371
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8371
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 1.0.1
>         Environment: All tests were done on a single node cluster, that runs namenode, secondarynamenode, datanode, all on one machine, running Ubuntu 12.04
>            Reporter: Giri
>            Priority: Minor
>              Labels: hdfs
>
> See the next comment for details.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
