hadoop-hdfs-issues mailing list archives

From "Juan Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6908) incorrect snapshot directory diff generated by snapshot deletion
Date Fri, 22 Aug 2014 00:24:11 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106231#comment-14106231 ]

Juan Yu commented on HDFS-6908:
-------------------------------

Thanks [~jingzhao].
Because the directory is deleted, any file created between the prior snapshot and the
snapshot being deleted must have been deleted as well, so there is a create/delete pair of
operations for each of those files. The file diff processing part adds the file to the
removedINodes list. When I debugged the fix, I saw that the inodes for those files were
deleted correctly, with no leak, and that the intermediate create/delete file changes were
cleaned up after combining the diff with the prior one as well.

{code}
} else if (topNode.isFile() && topNode.asFile().isWithSnapshot()) {
  INodeFile file = topNode.asFile();
  counts.add(file.getDiffs().deleteSnapshotDiff(post, prior, file,
      collectedBlocks, removedINodes, countDiffChange));
{code}
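
To make the create/delete cancellation described above concrete, here is a minimal toy model. It is only a sketch and not Hadoop's implementation: ToyChildrenDiff, combinePosterior, SnapshotDiffSketch, and the names file1, file2, s1, s2 are all made up for illustration, and it assumes each file name is used only once. It replays the reporter's scenario and shows that a file created after the prior snapshot and deleted after the second one leaves no trace once the two diffs are combined, while a file that existed at the prior snapshot stays in the deleted list.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy per-snapshot children diff of one directory. Hypothetical, not a Hadoop class. */
class ToyChildrenDiff {
    // Names created under the directory after this snapshot was taken.
    final Set<String> created = new LinkedHashSet<>();
    // Names that existed at snapshot time and were deleted afterwards.
    final Set<String> deleted = new LinkedHashSet<>();

    void create(String name) {
        created.add(name);
    }

    /** Records a deletion; returns true if it cancelled an earlier create. */
    boolean delete(String name) {
        if (created.remove(name)) {
            return true; // created and deleted in the same window: nothing to keep
        }
        deleted.add(name);
        return false;
    }

    /**
     * Folds the diff of the later snapshot (the one being deleted) into this
     * earlier diff. Returns the names whose create/delete pair cancelled out,
     * i.e. files that no remaining snapshot can see.
     */
    List<String> combinePosterior(ToyChildrenDiff posterior) {
        List<String> removed = new ArrayList<>();
        for (String name : posterior.deleted) {
            if (delete(name)) {
                removed.add(name);
            }
        }
        for (String name : posterior.created) {
            create(name);
        }
        return removed;
    }
}

class SnapshotDiffSketch {
    public static void main(String[] args) {
        // Steps 1-2: file1 exists when snapshot s1 is taken.
        ToyChildrenDiff s1 = new ToyChildrenDiff();
        // Step 3: file2 is created after s1, so s1's diff records the create.
        s1.create("file2");
        // Step 4: snapshot s2 is taken; later changes go into s2's diff.
        ToyChildrenDiff s2 = new ToyChildrenDiff();
        // Step 5: both files are deleted after s2.
        s2.delete("file1");
        s2.delete("file2");
        // Step 6: deleting s2 folds its diff into s1's diff.
        List<String> removed = s1.combinePosterior(s2);
        System.out.println("removed=" + removed
            + " created=" + s1.created + " deleted=" + s1.deleted);
    }
}
```

In this toy run, file2 ends up in the removed list (the analogue of removedINodes in the snippet above), and the combined diff for s1 keeps only file1 in its deleted list.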

> incorrect snapshot directory diff generated by snapshot deletion
> ----------------------------------------------------------------
>
>                 Key: HDFS-6908
>                 URL: https://issues.apache.org/jira/browse/HDFS-6908
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: snapshots
>            Reporter: Juan Yu
>            Assignee: Juan Yu
>            Priority: Critical
>         Attachments: HDFS-6908.001.patch
>
>
> In the following scenario, deleting a snapshot can generate an incorrect snapshot directory
> diff and a corrupted fsimage. If you restart the NN after that, you will get a NullPointerException.
> 1. create a directory and create a file under it
> 2. take a snapshot
> 3. create another file under that directory
> 4. take second snapshot
> 5. delete both files and the directory
> 6. delete second snapshot
> An incorrect directory diff will be generated.
> Restarting the NN will then throw an NPE:
> {code}
> java.lang.NullPointerException
> 	at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.addToDeletedList(FSImageFormatPBSnapshot.java:246)
> 	at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadDeletedList(FSImageFormatPBSnapshot.java:265)
> 	at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadDirectoryDiffList(FSImageFormatPBSnapshot.java:328)
> 	at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadSnapshotDiffSection(FSImageFormatPBSnapshot.java:192)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:254)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:168)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:208)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:906)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:892)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:715)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:653)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:276)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:882)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:629)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:498)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:554)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
