hadoop-hdfs-issues mailing list archives

From "Jing Zhao (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-6135) In HDFS upgrade with HA setup, JournalNode cannot handle layout version bump when rolling back
Date Mon, 24 Mar 2014 04:27:46 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao updated HDFS-6135:
----------------------------

    Attachment: HDFS-6135.002.patch

Another option is to ignore the future layoutVersion check in the JournalNode entirely. Since
the JN no longer decodes the edits (it uses the length field to scan through them), the JN has
some capability to handle edits with a future layoutVersion.
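
To illustrate why scanning by length field is less sensitive to format changes than decoding, here is a minimal sketch. It assumes a hypothetical framing (a 4-byte length followed by that many payload bytes); this is not the actual HDFS edit log format, and the class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;

// Illustrative only: NOT the real HDFS edit log layout.
// Assumed framing: [4-byte big-endian length][payload of that length], repeated.
final class LengthPrefixedScan {

    // Counts records by hopping over each payload without interpreting it,
    // so the payload encoding can change without breaking the scan.
    static int countRecords(byte[] segment) {
        ByteBuffer buf = ByteBuffer.wrap(segment);
        int count = 0;
        while (buf.hasRemaining()) {
            if (buf.remaining() < 4) {
                throw new IllegalStateException("truncated length field");
            }
            int length = buf.getInt();
            if (length < 0 || length > buf.remaining()) {
                throw new IllegalStateException("bad record length: " + length);
            }
            buf.position(buf.position() + length); // skip the opaque payload
            count++;
        }
        return count;
    }

    // Helper that builds a segment in the assumed framing from raw payloads.
    static byte[] frame(byte[]... payloads) {
        int total = 0;
        for (byte[] p : payloads) {
            total += 4 + p.length;
        }
        ByteBuffer buf = ByteBuffer.allocate(total);
        for (byte[] p : payloads) {
            buf.putInt(p.length);
            buf.put(p);
        }
        return buf.array();
    }
}
```

The scan depends only on the framing, not on the payload contents, which is also why it would break if the framing itself (rather than the op encoding) changed in a future version.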

However, this capability is limited: if something more significant (e.g., the segment-based
edits maintenance mechanism) changes in the future, old software will not be able to process
edits generated by new software. A better solution may be to split the NameNode layoutVersion
into a NameNode layoutVersion and a JournalNode layoutVersion, just like what we did for NN
and DN.
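
A rough sketch of the difference between a shared and a split layoutVersion check follows. The class, the method names, and the JournalNode version value of -1 are hypothetical and are not taken from any patch on this issue; only the -55 and -56 values come from the report. It relies on layout versions being negative and decreasing as the format evolves.

```java
// Illustrative only: names and the JN version value are assumptions.
final class SplitLayoutVersions {

    // Layout versions are negative and decrease with each format change,
    // so storage is readable when its version is not newer (not smaller)
    // than the version the running software supports.
    static boolean canRead(int storedVersion, int supportedVersion) {
        return storedVersion >= supportedVersion;
    }

    // Shared scheme: the JN validates its storage against the NameNode
    // layout version, so any NN bump also affects the JN on rollback.
    static boolean jnCanReadShared(int storedNnVersion, int softwareNnVersion) {
        return canRead(storedNnVersion, softwareNnVersion);
    }

    // Split scheme: the JN validates against its own layout version, which
    // only changes when the JN storage format itself changes.
    static boolean jnCanReadSplit(int storedJnVersion, int softwareJnVersion) {
        return canRead(storedJnVersion, softwareJnVersion);
    }
}
```

Under the shared scheme, `jnCanReadShared(-56, -55)` is false, which corresponds to the rollback failure in the report. Under the split scheme, an NN-only bump leaves the JN version unchanged (for example `jnCanReadSplit(-1, -1)`), so the old JN software still accepts the storage directory.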

> In HDFS upgrade with HA setup, JournalNode cannot handle layout version bump when rolling back
> ----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-6135
>                 URL: https://issues.apache.org/jira/browse/HDFS-6135
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>            Reporter: Jing Zhao
>            Assignee: Jing Zhao
>            Priority: Blocker
>         Attachments: HDFS-6135.000.patch, HDFS-6135.001.patch, HDFS-6135.002.patch, HDFS-6135.test.txt
>
>
> While doing an HDFS upgrade with an HA setup, if the layoutVersion changes in the upgrade,
> the rollback may trigger the following exception in the JournalNodes (suppose the new software
> bumped the layoutVersion from -55 to -56):
> {code}
> 14/03/21 01:01:53 FATAL namenode.NameNode: Exception in namenode join
> org.apache.hadoop.hdfs.qjournal.client.QuorumException: Could not check if roll back possible for one or more JournalNodes. 1 exceptions thrown:
> Unexpected version of storage directory /grid/1/tmp/journal/mycluster. Reported: -56. Expecting = -55.
> 	at org.apache.hadoop.hdfs.server.common.StorageInfo.setLayoutVersion(StorageInfo.java:203)
> 	at org.apache.hadoop.hdfs.server.common.StorageInfo.setFieldsFromProperties(StorageInfo.java:156)
> 	at org.apache.hadoop.hdfs.server.common.StorageInfo.readProperties(StorageInfo.java:135)
> 	at org.apache.hadoop.hdfs.qjournal.server.JNStorage.analyzeStorage(JNStorage.java:202)
> 	at org.apache.hadoop.hdfs.qjournal.server.JNStorage.<init>(JNStorage.java:73)
> 	at org.apache.hadoop.hdfs.qjournal.server.Journal.<init>(Journal.java:142)
> 	at org.apache.hadoop.hdfs.qjournal.server.JournalNode.getOrCreateJournal(JournalNode.java:87)
> 	at org.apache.hadoop.hdfs.qjournal.server.JournalNode.canRollBack(JournalNode.java:304)
> 	at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.canRollBack(JournalNodeRpcServer.java:228)
> {code}
> It looks like, on rollback, a JN running the old software cannot handle the future
> layoutVersion introduced by the new software.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
