Date: Tue, 22 Apr 2014 20:56:15 +0000 (UTC)
From: "Harish Butani (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Reply-To: mapreduce-issues@hadoop.apache.org
Subject: [jira] [Commented] (MAPREDUCE-5853) ChecksumFileSystem.getContentSummary() including contents for crc files

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977402#comment-13977402 ]

Harish Butani commented on MAPREDUCE-5853:
------------------------------------------

Thanks to [~brandon li]:
- This change was introduced by https://issues.apache.org/jira/browse/HADOOP-8014.
- Was fixed in https://issues.apache.org/jira/browse/HADOOP-10425

> ChecksumFileSystem.getContentSummary() including contents for crc files
> ------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5853
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5853
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Jason Dere
>
> Trying to track down some differences in Hive statistics between hadoop-1/hadoop-2.
> It looks like although ChecksumFileSystem.listStatus() filters out CRC files, getContentSummary() falls back to using the FilterFileSystem.getContentSummary() implementation, which calls fs.getContentSummary(). The underlying fs may not have the same filters as the ChecksumFileSystem and so the CRC files can get included in the content summary.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
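The mismatch described in the quoted report can be sketched with a small, self-contained model. This is only an illustration under stated assumptions: RawFs, ChecksumFs, and all of their methods are hypothetical stand-ins, not the real Hadoop classes, and filteredSummaryLength() is one possible remedy, not necessarily the approach taken in HADOOP-10425.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-ins for the classes named in the report; none of these
// types are the real Hadoop API.
class CrcSummaryDemo {

    /** Hypothetical raw file system: file name -> length in bytes. */
    static class RawFs {
        final Map<String, Long> files = new LinkedHashMap<>();

        List<String> listStatus() {
            return new ArrayList<>(files.keySet());
        }

        long length(String name) {
            return files.get(name);
        }

        // Sums every file it stores, checksum side files included.
        long getContentSummaryLength() {
            long total = 0;
            for (long len : files.values()) {
                total += len;
            }
            return total;
        }
    }

    /** Hypothetical wrapper that hides ".crc" side files in listings. */
    static class ChecksumFs {
        final RawFs fs;

        ChecksumFs(RawFs fs) {
            this.fs = fs;
        }

        // Assumed naming convention for side files: ".<name>.crc".
        static boolean isChecksumFile(String name) {
            return name.startsWith(".") && name.endsWith(".crc");
        }

        List<String> listStatus() {
            List<String> visible = new ArrayList<>();
            for (String name : fs.listStatus()) {
                if (!isChecksumFile(name)) {
                    visible.add(name);
                }
            }
            return visible;
        }

        // Behaviour described in the report: delegate to the underlying fs,
        // which does not apply the CRC filter.
        long delegatedSummaryLength() {
            return fs.getContentSummaryLength();
        }

        // One possible alternative: sum only what listStatus() exposes.
        long filteredSummaryLength() {
            long total = 0;
            for (String name : listStatus()) {
                total += fs.length(name);
            }
            return total;
        }
    }

    /** Builds a wrapper over one data file and its checksum side file. */
    static ChecksumFs sample() {
        RawFs raw = new RawFs();
        raw.files.put("part-00000", 1000L);
        raw.files.put(".part-00000.crc", 16L);
        return new ChecksumFs(raw);
    }

    public static void main(String[] args) {
        ChecksumFs cfs = sample();
        System.out.println("listStatus:        " + cfs.listStatus());
        System.out.println("delegated summary: " + cfs.delegatedSummaryLength());
        System.out.println("filtered summary:  " + cfs.filteredSummaryLength());
    }
}
```

In this model the listing shows a single 1000-byte file, while the delegated summary also counts the 16-byte side file. A statistics consumer that compares the two numbers would see that kind of discrepancy.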