hadoop-hdfs-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-13120) Snapshot diff could be corrupted after concat
Date Thu, 08 Feb 2018 18:38:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-13120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16357376#comment-16357376 ]

Hudson commented on HDFS-13120:
-------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13633 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13633/])
HDFS-13120. Snapshot diff could be corrupted after concat. Contributed (xyao: rev 8faf0b50d435039f69ea35f592856ca04d378809)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirConcatOp.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotDeletion.java


> Snapshot diff could be corrupted after concat
> ---------------------------------------------
>
>                 Key: HDFS-13120
>                 URL: https://issues.apache.org/jira/browse/HDFS-13120
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode, snapshots
>    Affects Versions: 2.7.0
>            Reporter: Xiaoyu Yao
>            Assignee: Xiaoyu Yao
>            Priority: Major
>         Attachments: HDFS-13120.001.patch, HDFS-13120.002.patch
>
>
> Snapshot diffs can be corrupted after files are concatenated. This can later trigger an AssertionError during DeleteSnapshot and getSnapshotDiff operations.
> For example, we have seen customers hit a stack trace similar to the one below, but while loading the edit log entry for DeleteSnapshotOp. After investigating, we found that this is a regression caused by HDFS-3689, where the snapshot diff is not fully cleaned up after concat.
> I will post a unit test to reproduce this, and a fix for it, shortly.
> {code}
> org.apache.hadoop.ipc.RemoteException(java.lang.AssertionError): Element already exists: element=0.txt, CREATED=[0.txt, 1.txt, 2.txt]
> 	at org.apache.hadoop.hdfs.util.Diff.insert(Diff.java:196)
> 	at org.apache.hadoop.hdfs.util.Diff.create(Diff.java:216)
> 	at org.apache.hadoop.hdfs.util.Diff.combinePosterior(Diff.java:463)
> 	at org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.combinePosteriorAndCollectBlocks(DirectoryWithSnapshotFeature.java:205)
> 	at org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.combinePosteriorAndCollectBlocks(DirectoryWithSnapshotFeature.java:162)
> 	at org.apache.hadoop.hdfs.server.namenode.snapshot.AbstractINodeDiffList.deleteSnapshotDiff(AbstractINodeDiffList.java:100)
> 	at org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:728)
> 	at org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:830)
> 	at org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:237)
> 	at org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:292)
> 	at org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:321)
> 	at org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp.deleteSnapshot(FSDirSnapshotOp.java:249)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteSnapshot(FSNamesystem.java:6566)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.deleteSnapshot(NameNodeRpcServer.java:1823)
> 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.deleteSnapshot(ClientNamenodeProtocolServerSideTranslatorPB.java:1200)
> 	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1007)
> 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:873)
> 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:819)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2679)
> {code} 
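
To illustrate the failure mode in the trace above, here is a minimal, self-contained sketch of a sorted "created" list that refuses duplicate inserts. It is not the Hadoop implementation: CreatedListSketch and its methods are hypothetical names that only borrow the create/combinePosterior vocabulary from the stack trace. It assumes that a stale entry left in a later diff (for example, one not cleaned up after concat) gets folded into an earlier diff when the snapshot between them is deleted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a directory diff's sorted "created" list.
// This is an illustration only, not org.apache.hadoop.hdfs.util.Diff.
final class CreatedListSketch {
    private final List<String> created = new ArrayList<>();

    /** Inserts a name in sorted order; fails if the name is already present. */
    void create(String name) {
        int i = Collections.binarySearch(created, name);
        if (i >= 0) {
            throw new AssertionError(
                "Element already exists: element=" + name + ", CREATED=" + created);
        }
        // binarySearch returns -(insertionPoint) - 1 when the key is absent.
        created.add(-i - 1, name);
    }

    /** Folds the entries of a later (posterior) diff into this one. */
    void combinePosterior(CreatedListSketch posterior) {
        for (String name : posterior.created) {
            create(name);
        }
    }

    /** Returns a copy of the current entries, in sorted order. */
    List<String> names() {
        return new ArrayList<>(created);
    }

    public static void main(String[] args) {
        CreatedListSketch prior = new CreatedListSketch();
        prior.create("0.txt");
        prior.create("1.txt");
        prior.create("2.txt");

        // Assumption: a stale entry survives in the later diff.
        CreatedListSketch posterior = new CreatedListSketch();
        posterior.create("0.txt");

        try {
            prior.combinePosterior(posterior);
        } catch (AssertionError e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the duplicate is only detected when the two lists are merged, which is consistent with the report that the error surfaces later, on DeleteSnapshot, rather than at concat time.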



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

