hadoop-hdfs-issues mailing list archives

From "Jing Zhao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5982) Restarting Namenode can have NPE.
Date Thu, 20 Feb 2014 06:12:19 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906657#comment-13906657 ]

Jing Zhao commented on HDFS-5982:
---------------------------------

NPE hit by [~tassapola]'s test:
{code}
2014-02-19 01:59:04,616 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.lang.NullPointerException
	at org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotFSImageFormat.loadFileDiff(SnapshotFSImageFormat.java:131)
	at org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotFSImageFormat.loadFileDiffList(SnapshotFSImageFormat.java:111)
	at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Loader.loadINode(FSImageFormat.java:688)
	at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Loader.loadINodeWithLocalName(FSImageFormat.java:636)
	at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Loader.loadChildren(FSImageFormat.java:468)
	at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Loader.loadDirectoryWithSnapshot(FSImageFormat.java:510)
	at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Loader.loadDirectoryWithSnapshot(FSImageFormat.java:519)
	at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Loader.loadDirectoryWithSnapshot(FSImageFormat.java:519)
	at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Loader.loadDirectoryWithSnapshot(FSImageFormat.java:519)
	at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Loader.loadDirectoryWithSnapshot(FSImageFormat.java:519)
	at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Loader.loadLocalNameINodesWithSnapshot(FSImageFormat.java:412)
	at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Loader.load(FSImageFormat.java:350)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:832)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:821)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:669)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:638)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:265)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:856)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:616)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:434)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:490)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:646)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:631)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1270)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1336)
2014-02-19 01:59:04,619 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
{code}

> Restarting Namenode can have NPE.
> ---------------------------------
>
>                 Key: HDFS-5982
>                 URL: https://issues.apache.org/jira/browse/HDFS-5982
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.3.0
>            Reporter: Tassapol Athiapinya
>            Assignee: Jing Zhao
>            Priority: Critical
>             Fix For: 2.3.0
>
>
> Currently, after deleting a snapshottable directory that no longer has any snapshots,
> we also remove the directory from the snapshottable directory list in SnapshotManager.
> This works fine when handling a delete request from a user. However, when we apply the
> OP_DELETE editlog, FSDirectory#unprotectedDelete(String, long) is called, which does not
> include the "updating snapshot manager" step. This may leave a non-existent inode id in
> the snapshottable directory list, and can even lead to FSImage corruption.
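
The mismatch described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the HDFS implementation: the class SnapshottableListSketch, its fields, and the methods addSnapshottableDir, delete, and danglingIds are hypothetical, and unprotectedDelete here only borrows the name of the real method (its signature is simplified to take an inode id). The point is that the user-request path updates the snapshottable list while the replay path does not, which leaves a listed id with no inode behind it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model only; none of these names are real HDFS APIs. */
class SnapshottableListSketch {
    /** Stand-in for the namespace: inode id -> path. */
    private final Map<Long, String> inodes = new HashMap<>();
    /** Stand-in for the snapshottable directory list kept by SnapshotManager. */
    private final Set<Long> snapshottableIds = new TreeSet<>();

    /** Registers a directory and marks it snapshottable (no snapshots taken). */
    void addSnapshottableDir(long id, String path) {
        inodes.put(id, path);
        snapshottableIds.add(id);
    }

    /** User-request path: removes the inode and also updates the list. */
    void delete(long id) {
        unprotectedDelete(id);
        snapshottableIds.remove(id);
    }

    /** Edit-log replay path as described in the issue: removes the inode only. */
    void unprotectedDelete(long id) {
        inodes.remove(id);
    }

    /** Ids still listed as snapshottable although no inode exists for them. */
    Set<Long> danglingIds() {
        Set<Long> dangling = new TreeSet<>(snapshottableIds);
        dangling.removeAll(inodes.keySet());
        return dangling;
    }

    public static void main(String[] args) {
        SnapshottableListSketch ns = new SnapshottableListSketch();
        ns.addSnapshottableDir(1L, "/a");
        ns.addSnapshottableDir(2L, "/b");
        ns.delete(1L);            // user request: list stays consistent
        ns.unprotectedDelete(2L); // replayed OP_DELETE: id 2 is left in the list
        System.out.println("dangling ids: " + ns.danglingIds());
    }
}
```

In this model, a later pass that walks the snapshottable list and dereferences each id would find nothing for the dangling entry, which is the kind of inconsistency the description warns about.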



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
