hadoop-mapreduce-issues mailing list archives

From "Harish Butani (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5853) ChecksumFileSystem.getContentSummary() including contents for crc files
Date Tue, 22 Apr 2014 20:56:15 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977402#comment-13977402 ]

Harish Butani commented on MAPREDUCE-5853:

Thanks to [~brandon li]:
- This behavior was introduced by https://issues.apache.org/jira/browse/HADOOP-8014.
- It was fixed in https://issues.apache.org/jira/browse/HADOOP-10425.

> ChecksumFileSystem.getContentSummary() including contents for crc files 
> ------------------------------------------------------------------------
>                 Key: MAPREDUCE-5853
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5853
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Jason Dere
> Trying to track down some differences in Hive statistics between hadoop-1/hadoop-2.
> It looks like although ChecksumFileSystem.listStatus() filters out CRC files, getContentSummary()
> falls back to using the FilterFileSystem.getContentSummary() implementation, which calls fs.getContentSummary().
> The underlying fs may not have the same filters as the ChecksumFileSystem and so the CRC
> files can get included in the content summary.
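
The delegation described in the issue can be illustrated with a small, self-contained toy model. This is only a sketch: ToyFs, RawFs, FilterFs, ChecksumFs, and getContentSummaryFileCount() are hypothetical stand-ins invented for illustration, not Hadoop's actual classes or signatures, and the ".crc" suffix check is a simplification of the real filter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: these types are hypothetical stand-ins, not Hadoop's real API.
public class CrcSummaryDemo {

    /** Minimal file-system interface with just the two calls discussed in the issue. */
    public interface ToyFs {
        List<String> listStatus();
        int getContentSummaryFileCount();
    }

    /** Stand-in for the underlying (raw) file system: it sees every file. */
    static class RawFs implements ToyFs {
        private final List<String> files;
        RawFs(List<String> files) { this.files = files; }
        @Override public List<String> listStatus() { return new ArrayList<>(files); }
        @Override public int getContentSummaryFileCount() { return files.size(); }
    }

    /** Stand-in for a filter wrapper: forwards every call to the wrapped fs. */
    static class FilterFs implements ToyFs {
        protected final ToyFs fs;
        FilterFs(ToyFs fs) { this.fs = fs; }
        @Override public List<String> listStatus() { return fs.listStatus(); }
        @Override public int getContentSummaryFileCount() {
            return fs.getContentSummaryFileCount();
        }
    }

    /** Stand-in for the checksum wrapper: it overrides listStatus() only. */
    static class ChecksumFs extends FilterFs {
        ChecksumFs(ToyFs fs) { super(fs); }
        @Override public List<String> listStatus() {
            List<String> visible = new ArrayList<>();
            for (String name : fs.listStatus()) {
                // Simplified filter: hide anything that ends in ".crc".
                if (!name.endsWith(".crc")) {
                    visible.add(name);
                }
            }
            return visible;
        }
        // getContentSummaryFileCount() is inherited from FilterFs,
        // so it goes straight to the raw fs and bypasses the filter above.
    }

    /** Builds a wrapped fs holding one data file and its checksum file. */
    public static ToyFs sample() {
        return new ChecksumFs(new RawFs(Arrays.asList("part-00000", ".part-00000.crc")));
    }

    public static void main(String[] args) {
        ToyFs fs = sample();
        System.out.println("listed files:  " + fs.listStatus().size());
        System.out.println("summary files: " + fs.getContentSummaryFileCount());
    }
}
```

In this toy model the listing reports one file while the summary counts two, which mirrors the kind of mismatch the reporter describes between listStatus() and getContentSummary().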

This message was sent by Atlassian JIRA
