Date: Tue, 22 Apr 2014 20:40:16 +0000 (UTC)
From: "Jason Dere (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Subject: [jira] [Created] (MAPREDUCE-5853) ChecksumFileSystem.getContentSummary() including contents for crc files

Jason Dere created MAPREDUCE-5853:
-------------------------------------

             Summary: ChecksumFileSystem.getContentSummary() including contents for crc files
                 Key: MAPREDUCE-5853
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5853
             Project: Hadoop Map/Reduce
          Issue Type: Bug
            Reporter: Jason Dere

Trying to track down some differences in Hive statistics between hadoop-1 and hadoop-2. It looks like, although ChecksumFileSystem.listStatus() filters out CRC files, getContentSummary() falls back to the FilterFileSystem.getContentSummary() implementation, which calls fs.getContentSummary().
The underlying fs may not apply the same filters as ChecksumFileSystem, so the CRC files can end up included in the content summary.
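
To illustrate the discrepancy, here is a minimal toy model in plain Java. It is not Hadoop code: the class CrcSummarySketch and the methods rawLength and filteredLength are hypothetical names, the directory is just a map of file name to length, and the checksum naming convention (".<name>.crc") is assumed to match what ChecksumFileSystem uses for its side files.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only (not Hadoop code): a flat directory of file name -> length in bytes.
final class CrcSummarySketch {

    // Assumed naming convention for checksum side files: ".<name>.crc".
    static boolean isChecksumFile(String name) {
        return name.startsWith(".") && name.endsWith(".crc");
    }

    // What an unfiltered underlying file system would report: every file counts.
    static long rawLength(Map<String, Long> dir) {
        long total = 0;
        for (long len : dir.values()) {
            total += len;
        }
        return total;
    }

    // What a summary consistent with the filtered listing would report.
    static long filteredLength(Map<String, Long> dir) {
        long total = 0;
        for (Map.Entry<String, Long> e : dir.entrySet()) {
            if (!isChecksumFile(e.getKey())) {
                total += e.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Long> dir = new LinkedHashMap<>();
        dir.put("part-00000", 1000L);
        dir.put(".part-00000.crc", 16L);
        System.out.println("raw=" + rawLength(dir) + " filtered=" + filteredLength(dir));
    }
}
```

In this model the difference between the two totals is exactly the size of the CRC side files, which is the kind of gap that could explain the differing Hive statistics.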