hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Juan Yu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-6908) incorrect snapshot directory diff generated by snapshot deletion
Date Thu, 21 Aug 2014 22:10:11 GMT

     [ https://issues.apache.org/jira/browse/HDFS-6908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Juan Yu updated HDFS-6908:

    Attachment: HDFS-6908.001.patch

The problem is when deleting snapshot, hdfs will clean Inode that not in any snapshot any
more. however, the logic is a bit wrong.
if an inode is directory, it calls destroyCreatedList() to put created children inodes(created
between prior snapshot and the deleting one) to removedINodes list, but also clear the createdList.
this breaks the create/delete pair operation. so later, when combine the diff with prior diff,
the delete operation stays.
I think instead of calling destroyCreatedList(), it should just do dir.removeChild(c), and
let the file diff does cleanup for files.

> incorrect snapshot directory diff generated by snapshot deletion
> ----------------------------------------------------------------
>                 Key: HDFS-6908
>                 URL: https://issues.apache.org/jira/browse/HDFS-6908
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Juan Yu
>            Assignee: Juan Yu
>         Attachments: HDFS-6908.001.patch
> In the following scenario, delete snapshot could generate incorrect snapshot directory
diff and corrupted fsimage, if you restart NN after that, you will get NullPointerException.
> 1. create a directory and create a file under it
> 2. take a snapshot
> 3. create another file under that directory
> 4. take second snapshot
> 5. delete both files and the directory
> 6. delete second snapshot
> incorrect directory diff will be generated.
> Restart NN will throw NPE
> {code}
> java.lang.NullPointerException
> 	at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.addToDeletedList(FSImageFormatPBSnapshot.java:246)
> 	at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadDeletedList(FSImageFormatPBSnapshot.java:265)
> 	at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadDirectoryDiffList(FSImageFormatPBSnapshot.java:328)
> 	at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadSnapshotDiffSection(FSImageFormatPBSnapshot.java:192)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:254)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:168)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:208)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:906)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:892)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:715)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:653)
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:276)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:882)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:629)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:498)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:554)
> {code}

This message was sent by Atlassian JIRA

View raw message