hadoop-hdfs-issues mailing list archives

From "Aaron T. Myers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4462) 2NN will fail to checkpoint after an HDFS upgrade from a pre-federation version of HDFS
Date Fri, 01 Feb 2013 18:36:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13568946#comment-13568946 ]

Aaron T. Myers commented on HDFS-4462:
--------------------------------------

[~acmurthy] Blocker? Probably not. Pretty good to have? I think so. There's a pretty simple
work-around: when upgrading from a pre-federation version of HDFS, blow away your 2NN checkpoint
dirs before starting up your 2NN again. A problem will arise if an admin doesn't notice that
all of their 2NN checkpoints are failing post-upgrade.

Regardless, it's a pretty simple change - I'm hoping it can get committed today.
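
To make the "admin doesn't notice" risk concrete, here is a minimal sketch of scanning 2NN log lines for the failure. The marker text is copied from the error quoted in the issue description below; the class name, method name, and the INFO sample line are made up for illustration and are not part of Hadoop.

```java
import java.util.Arrays;
import java.util.List;

public class CheckpointFailureScan {
    // Marker text taken from the error message quoted in the issue description.
    static final String MARKER = "Inconsistent checkpoint fields";

    /** Counts ERROR lines that contain the checkpoint-failure marker. */
    static long countCheckpointFailures(List<String> logLines) {
        return logLines.stream()
                .filter(line -> line.contains("ERROR") && line.contains(MARKER))
                .count();
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "13/01/31 17:02:25 ERROR namenode.SecondaryNameNode: checkpoint: Inconsistent checkpoint fields.",
            // Hypothetical unrelated line, included only to show it is ignored.
            "13/01/31 17:02:26 INFO namenode.SecondaryNameNode: unrelated line");
        System.out.println("failed checkpoints seen: " + countCheckpointFailures(sample));
    }
}
```

A non-zero count after an upgrade would suggest the work-around above is needed.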
                
> 2NN will fail to checkpoint after an HDFS upgrade from a pre-federation version of HDFS
> ---------------------------------------------------------------------------------------
>
>                 Key: HDFS-4462
>                 URL: https://issues.apache.org/jira/browse/HDFS-4462
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.0.2-alpha
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: HDFS-4462.patch, HDFS-4462.patch
>
>
> The 2NN currently has logic to detect when its on-disk FS metadata needs an upgrade with
> respect to the NN's metadata (i.e. the layout versions are different). In this case it
> will proceed with the checkpoint even though the storage signatures do not match precisely,
> provided the BP ID and Cluster ID do match exactly. However, when upgrading from versions
> of HDFS prior to federation, which had no BP IDs or Cluster IDs, checkpoints will always fail
> with an error like the following:
> {noformat}
> 13/01/31 17:02:25 ERROR namenode.SecondaryNameNode: checkpoint: Inconsistent checkpoint fields.
> LV = -40 namespaceID = 403832480 cTime = 1359680537192 ; clusterId = CID-0df6ff22-1165-4c7d-9630-429972a7737c ; blockpoolId = BP-1520616013-172.21.3.106-1359680537136.
> Expecting respectively: -19; 403832480; 0; ; .
> {noformat}
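
The check described above can be sketched roughly as follows. This is an illustration only, not the actual `SecondaryNameNode` code and not the attached patch: the class, field, and method names are hypothetical, and the "relaxed" variant is just one assumed way to handle pre-federation storage (falling back to the namespace ID when the 2NN has no Cluster ID or BP ID on disk). The field values come from the error message above, reading the first set as the upgraded NN's and the "Expecting" set as the 2NN's stale on-disk values.

```java
public class CheckpointSignatureSketch {

    /** Hypothetical stand-in for the storage fields printed in the error message. */
    static final class StorageFields {
        final int layoutVersion;
        final int namespaceId;
        final long cTime;
        final String clusterId;   // empty for pre-federation storage
        final String blockpoolId; // empty for pre-federation storage

        StorageFields(int layoutVersion, int namespaceId, long cTime,
                      String clusterId, String blockpoolId) {
            this.layoutVersion = layoutVersion;
            this.namespaceId = namespaceId;
            this.cTime = cTime;
            this.clusterId = clusterId;
            this.blockpoolId = blockpoolId;
        }
    }

    static boolean exactMatch(StorageFields nn, StorageFields snn) {
        return nn.layoutVersion == snn.layoutVersion
                && nn.namespaceId == snn.namespaceId
                && nn.cTime == snn.cTime
                && nn.clusterId.equals(snn.clusterId)
                && nn.blockpoolId.equals(snn.blockpoolId);
    }

    /** Behaviour as described in the issue: tolerate a layout mismatch only if both IDs match. */
    static boolean currentCheckAccepts(StorageFields nn, StorageFields snn) {
        if (exactMatch(nn, snn)) {
            return true;
        }
        boolean needsUpgrade = nn.layoutVersion != snn.layoutVersion;
        return needsUpgrade
                && nn.clusterId.equals(snn.clusterId)
                && nn.blockpoolId.equals(snn.blockpoolId);
    }

    /** Assumed relaxation: if the 2NN storage predates federation, compare namespace IDs instead. */
    static boolean relaxedCheckAccepts(StorageFields nn, StorageFields snn) {
        if (currentCheckAccepts(nn, snn)) {
            return true;
        }
        boolean snnIsPreFederation = snn.clusterId.isEmpty() && snn.blockpoolId.isEmpty();
        return nn.layoutVersion != snn.layoutVersion
                && snnIsPreFederation
                && nn.namespaceId == snn.namespaceId;
    }

    public static void main(String[] args) {
        StorageFields nn = new StorageFields(-40, 403832480, 1359680537192L,
                "CID-0df6ff22-1165-4c7d-9630-429972a7737c",
                "BP-1520616013-172.21.3.106-1359680537136");
        StorageFields snn = new StorageFields(-19, 403832480, 0L, "", "");
        System.out.println("current check accepts: " + currentCheckAccepts(nn, snn));
        System.out.println("relaxed check accepts: " + relaxedCheckAccepts(nn, snn));
    }
}
```

With these values the first check rejects the checkpoint because the empty IDs on the 2NN side can never equal the NN's generated IDs, while the namespace ID still matches, which is what the assumed fallback relies on.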
