hadoop-hdfs-issues mailing list archives

From "Jing Zhao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5425) Renaming underconstruction file with snapshots can make NN failure on restart
Date Mon, 11 Nov 2013 18:25:18 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819179#comment-13819179 ]

Jing Zhao commented on HDFS-5425:
---------------------------------

Thanks for the work Vinay and Uma!

The issue here is that we want to replace an INodeFile with an INodeFileUC. However, because
of the rename operation, the original INodeFile is actually referenced by INodeReference
instances. So in the unit test in Vinay's patch, before the replacement, we have:
{code}
snapshot s0
deleted list: bar2 (INodeReference.WithName)
created list: bar2 (INodeReference.DstReference)
{code}
where these two bar2 instances point to the same WithCount node, which in turn points to
the real INodeFile instance.

Thus, for the replacement, we only need to make the WithCount node point to a new INodeFileUC
instance, rather than replacing the reference nodes in the diff list of s0.
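
To illustrate the indirection, here is a minimal sketch. It uses simplified, hypothetical
stand-in classes (FileNode, FileUnderConstructionNode, WithCount, Reference), not the actual
HDFS classes or their APIs. It only shows why swapping the node held by the shared WithCount
is enough: both references see the new node without the diff lists being touched.

```java
import java.util.Arrays;
import java.util.List;

public class WithCountSketch {
    /** Stand-in for an inode; NOT the real HDFS INode class. */
    interface Node {
        String describe();
    }

    /** Stand-in for a completed file (INodeFile in the comment above). */
    static class FileNode implements Node {
        final String name;
        FileNode(String name) { this.name = name; }
        public String describe() { return name + " (file)"; }
    }

    /** Stand-in for a file under construction (INodeFileUC in the comment above). */
    static class FileUnderConstructionNode implements Node {
        final String name;
        FileUnderConstructionNode(String name) { this.name = name; }
        public String describe() { return name + " (file under construction)"; }
    }

    /** Shared indirection node: the only object that points at the real file. */
    static class WithCount {
        private Node referred;
        WithCount(Node referred) { this.referred = referred; }
        Node getReferred() { return referred; }
        void setReferred(Node replacement) { this.referred = replacement; }
    }

    /** Stand-in for WithName / DstReference: both delegate to the shared WithCount. */
    static class Reference implements Node {
        final String kind;
        final WithCount target;
        Reference(String kind, WithCount target) {
            this.kind = kind;
            this.target = target;
        }
        public String describe() {
            return kind + " -> " + target.getReferred().describe();
        }
    }

    public static void main(String[] args) {
        WithCount shared = new WithCount(new FileNode("bar2"));

        // Diff of snapshot s0: both entries hold references to the same WithCount.
        List<Reference> deletedList = Arrays.asList(new Reference("WithName", shared));
        List<Reference> createdList = Arrays.asList(new Reference("DstReference", shared));

        // The replacement: repoint the WithCount only; the lists are left untouched.
        shared.setReferred(new FileUnderConstructionNode("bar2"));

        System.out.println(deletedList.get(0).describe());
        System.out.println(createdList.get(0).describe());
    }
}
```

In this sketch the reference objects stored in the deleted and created lists are never
replaced, so there is nothing to look up or swap in the diff of s0.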


> Renaming underconstruction file with snapshots can make NN failure on restart
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-5425
>                 URL: https://issues.apache.org/jira/browse/HDFS-5425
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 2.2.0
>            Reporter: sathish
>            Assignee: Vinay
>         Attachments: HDFS-5425.patch, HDFS-5425.patch, HDFS-5425.patch
>
>
> I hit this while doing some snapshot operations such as createSnapshot and renameSnapshot.
> When I restarted my NN, it shut down with this exception:
> 2013-10-24 21:07:03,040 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
> java.lang.IllegalStateException
> 	at com.google.common.base.Preconditions.checkState(Preconditions.java:133)
> 	at org.apache.hadoop.hdfs.server.namenode.snapshot.INodeDirectoryWithSnapshot$ChildrenDiff.replace(INodeDirectoryWithSnapshot.java:82)
> 	at org.apache.hadoop.hdfs.server.namenode.snapshot.INodeDirectoryWithSnapshot$ChildrenDiff.access$700(INodeDirectoryWithSnapshot.java:62)
> 	at org.apache.hadoop.hdfs.server.namenode.snapshot.INodeDirectoryWithSnapshot$DirectoryDiffList.replaceChild(INodeDirectoryWithSnapshot.java:397)
> 	at org.apache.hadoop.hdfs.server.namenode.snapshot.INodeDirectoryWithSnapshot$DirectoryDiffList.access$900(INodeDirectoryWithSnapshot.java:376)
> 	at org.apache.hadoop.hdfs.server.namenode.snapshot.INodeDirectoryWithSnapshot.replaceChild(INodeDirectoryWithSnapshot.java:598)
> 	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedReplaceINodeFile(FSDirectory.java:1548)
> 	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.replaceINodeFile(FSDirectory.java:1537)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Loader.loadFilesUnderConstruction(FSImageFormat.java:855)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Loader.load(FSImageFormat.java:350)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:910)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:899)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:751)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:720)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:266)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:784)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:563)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:422)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:472)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:670)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:655)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1245)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1311)
> 2013-10-24 21:07:03,050 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
> 2013-10-24 21:07:03,052 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:




--
This message was sent by Atlassian JIRA
(v6.1#6144)
