Date: Fri, 29 Sep 2017 21:43:11 +0000 (UTC)
From: "Arun Suresh (JIRA)"
To: common-issues@hadoop.apache.org
Subject: [jira] [Updated] (HADOOP-13091) DistCp masks potential CRC check failures

     [ https://issues.apache.org/jira/browse/HADOOP-13091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun Suresh updated HADOOP-13091:
---------------------------------

Is this still on target for 2.9.0? If not, can we push this out to the next major release?

> DistCp masks potential CRC check failures
> -----------------------------------------
>
>                 Key: HADOOP-13091
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13091
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.7.1
>            Reporter: Elliot West
>            Assignee: Yiqun Lin
>         Attachments: HADOOP-13091.003.patch, HADOOP-13091.004.patch, HDFS-10338.001.patch, HDFS-10338.002.patch
>
>
> There appear to be edge cases whereby CRC checks may be circumvented when requests for checksums from the source or target file system fail. In this event CRCs could differ between the source and target and yet the DistCp copy would succeed, even when the 'skip CRC check' option is not being used.
> The code in question is contained in the method [{{org.apache.hadoop.tools.util.DistCpUtils#checksumsAreEqual(...)}}|https://github.com/apache/hadoop/blob/release-2.7.1/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/util/DistCpUtils.java#L457]
> Specifically, this code block suggests that if there is a failure when trying to read the source or target checksum then the method will return {{true}} (i.e. the checksums are equal), implying that the check succeeded. In actual fact we just failed to obtain the checksum and could not perform the check.
> {code}
>     try {
>       sourceChecksum = sourceChecksum != null ? sourceChecksum :
>           sourceFS.getFileChecksum(source);
>       targetChecksum = targetFS.getFileChecksum(target);
>     } catch (IOException e) {
>       LOG.error("Unable to retrieve checksum for " + source + " or "
>           + target, e);
>     }
>     return (sourceChecksum == null || targetChecksum == null ||
>         sourceChecksum.equals(targetChecksum));
> {code}
> I believe that at the very least the caught {{IOException}} should be re-thrown. If this is not deemed desirable then I believe an option ({{--strictCrc}}?) should be added to enforce a strict check where we require that both the source and target CRCs are retrieved, are not null, and are then compared for equality. If for any reason either of the CRC retrievals fails then an exception is thrown.
> Clearly some {{FileSystems}} do not support CRCs and invocations of {{FileSystem.getFileChecksum(...)}} return {{null}} in these instances. I would suggest that these should fail a strict CRC check to prevent users developing a false sense of security in their copy pipeline.
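
For illustration only, the strict behaviour proposed in the description might look roughly like the sketch below. It is not taken from any of the attached patches. To stay self-contained it does not use the Hadoop API: the {{ChecksumSupplier}} interface is a hypothetical stand-in for a call such as {{FileSystem.getFileChecksum(...)}} (it may return {{null}} or throw), and the {{strict}} parameter is a hypothetical stand-in for the suggested {{--strictCrc}} option. With {{strict}} set to false the method mirrors the lenient behaviour quoted above.

```java
import java.io.IOException;

class StrictChecksumCheck {

    /**
     * Hypothetical stand-in for a checksum lookup such as
     * FileSystem.getFileChecksum(path): may return null or throw.
     */
    @FunctionalInterface
    interface ChecksumSupplier {
        Object get() throws IOException;
    }

    /**
     * Compares two checksums.
     *
     * strict == false: mirrors the lenient behaviour quoted in the issue,
     * where a failed or missing checksum is reported as "equal".
     * strict == true: a failed retrieval or a null checksum raises an
     * IOException instead of silently passing the check.
     */
    static boolean checksumsAreEqual(ChecksumSupplier source,
                                     ChecksumSupplier target,
                                     boolean strict) throws IOException {
        Object sourceChecksum;
        Object targetChecksum;
        try {
            sourceChecksum = source.get();
            targetChecksum = target.get();
        } catch (IOException e) {
            if (strict) {
                // Re-throw so the caller knows no comparison took place.
                throw new IOException("Unable to retrieve checksum", e);
            }
            return true; // lenient: the failure is masked
        }
        if (sourceChecksum == null || targetChecksum == null) {
            if (strict) {
                throw new IOException(
                    "Checksum not available for source or target");
            }
            return true; // lenient: a missing checksum counts as a match
        }
        return sourceChecksum.equals(targetChecksum);
    }
}
```

Under these assumptions, a caller running in strict mode would see an exception whenever the comparison could not actually be performed, rather than a {{true}} result that is indistinguishable from a real match.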