From hdfs-issues-return-209747-archive-asf-public=cust-asf.ponee.io@hadoop.apache.org Thu Feb 8 01:15:05 2018 Return-Path: X-Original-To: archive-asf-public@eu.ponee.io Delivered-To: archive-asf-public@eu.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by mx-eu-01.ponee.io (Postfix) with ESMTP id 5F5E718065B for ; Thu, 8 Feb 2018 01:15:05 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id 4F7ED160C5D; Thu, 8 Feb 2018 00:15:05 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 98CD9160C5B for ; Thu, 8 Feb 2018 01:15:04 +0100 (CET) Received: (qmail 65385 invoked by uid 500); 8 Feb 2018 00:15:03 -0000 Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list hdfs-issues@hadoop.apache.org Received: (qmail 65364 invoked by uid 99); 8 Feb 2018 00:15:03 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd4-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 08 Feb 2018 00:15:03 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd4-us-west.apache.org (ASF Mail Server at spamd4-us-west.apache.org) with ESMTP id 374B5C0042 for ; Thu, 8 Feb 2018 00:15:03 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -110.311 X-Spam-Level: X-Spam-Status: No, score=-110.311 tagged_above=-999 required=6.31 tests=[ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_MED=-2.3, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01, USER_IN_DEF_SPF_WL=-7.5, USER_IN_WHITELIST=-100] autolearn=disabled Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd4-us-west.apache.org [10.40.0.11]) (amavisd-new, port 10024) with ESMTP id 3UJZ6VFdGoXN for ; Thu, 8 Feb 2018 00:15:02 +0000 (UTC) Received: from mailrelay1-us-west.apache.org (mailrelay1-us-west.apache.org [209.188.14.139]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTP id A68575F2F0 for ; Thu, 8 Feb 2018 00:15:01 +0000 (UTC) Received: from jira-lw-us.apache.org (unknown [207.244.88.139]) by mailrelay1-us-west.apache.org (ASF Mail Server at mailrelay1-us-west.apache.org) with ESMTP id B0F9DE015E for ; Thu, 8 Feb 2018 00:15:00 +0000 (UTC) Received: from jira-lw-us.apache.org (localhost [127.0.0.1]) by jira-lw-us.apache.org (ASF Mail Server at jira-lw-us.apache.org) with ESMTP id 1B0F821E85 for ; Thu, 8 Feb 2018 00:15:00 +0000 (UTC) Date: Thu, 8 Feb 2018 00:15:00 +0000 (UTC) From: "Xiaoyu Yao (JIRA)" To: hdfs-issues@hadoop.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Created] (HDFS-13120) Snapshot diff could be corrupted after concat MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 Xiaoyu Yao created HDFS-13120: --------------------------------- Summary: Snapshot diff could be corrupted after concat Key: HDFS-13120 URL: https://issues.apache.org/jira/browse/HDFS-13120 Project: Hadoop HDFS Issue Type: Bug Components: namenode, snapshots Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao The snapshot diff can be corrupted after concat files. This could lead to Assertion upon DeleteSnapshot and getSnapshotDiff operations later. For example, we have seen customers hit stack trace similar to the one below but during loading edit entry of DeleteSnapshotOp. After the investigation, we found this is a regression caused by HDFS-3689 where the snapshot diff is not fully cleaned up after concat. I will post the unit test to repro this and fix for it shortly. {code} org.apache.hadoop.ipc.RemoteException(java.lang.AssertionError): Element already exists: element=0.txt, CREATED=[0.txt, 1.txt, 2.txt] at org.apache.hadoop.hdfs.util.Diff.insert(Diff.java:196) at org.apache.hadoop.hdfs.util.Diff.create(Diff.java:216) at org.apache.hadoop.hdfs.util.Diff.combinePosterior(Diff.java:463) at org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.combinePosteriorAndCollectBlocks(DirectoryWithSnapshotFeature.java:205) at org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.combinePosteriorAndCollectBlocks(DirectoryWithSnapshotFeature.java:162) at org.apache.hadoop.hdfs.server.namenode.snapshot.AbstractINodeDiffList.deleteSnapshotDiff(AbstractINodeDiffList.java:100) at org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:728) at org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:830) at org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:237) at org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:292) at org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:321) at org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp.deleteSnapshot(FSDirSnapshotOp.java:249) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteSnapshot(FSNamesystem.java:6566) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.deleteSnapshot(NameNodeRpcServer.java:1823) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.deleteSnapshot(ClientNamenodeProtocolServerSideTranslatorPB.java:1200) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1007) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:873) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:819) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2679) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org