hadoop-mapreduce-issues mailing list archives

From "Alejandro Abdelnur (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5038) old API CombineFileInputFormat missing fixes that are in new API
Date Fri, 01 Mar 2013 22:41:13 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591038#comment-13591038 ]

Alejandro Abdelnur commented on MAPREDUCE-5038:
-----------------------------------------------

The backport looks OK. Still, one thing worries me:

{code}
+  @Override
+  protected boolean isSplitable(FileSystem fs, Path file) {
+    final CompressionCodec codec =
+      new CompressionCodecFactory(fs.getConf()).getCodec(file);
+    return codec == null;
+  }
{code}

We should take splittable codecs into account; trunk does, and so does 0.22.
I wonder where and why this got dropped in Hadoop 1. Any idea?
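
To make the concern concrete, here is a rough, self-contained sketch of the check I have in mind: a file is splitable if it has no codec, or if its codec is a splittable one. The interfaces and the suffix-based getCodec below are stand-ins I made up so the snippet runs without Hadoop on the classpath; they only mirror the relationship between CompressionCodec and SplittableCompressionCodec. In the actual patch the codec would still come from CompressionCodecFactory, and this assumes the splittable codec interface is available on the target branch.

```java
import java.util.Arrays;
import java.util.List;

public class SplitableCheckSketch {
    /** Stand-in for Hadoop's CompressionCodec; only the file extension is modeled. */
    interface CompressionCodec {
        String getDefaultExtension();
    }

    /** Stand-in for Hadoop's SplittableCompressionCodec, which extends CompressionCodec. */
    interface SplittableCompressionCodec extends CompressionCodec {
    }

    /** Hypothetical non-splittable codec (gzip-like). */
    static final class GzipLikeCodec implements CompressionCodec {
        @Override
        public String getDefaultExtension() {
            return ".gz";
        }
    }

    /** Hypothetical splittable codec (bzip2-like). */
    static final class Bzip2LikeCodec implements SplittableCompressionCodec {
        @Override
        public String getDefaultExtension() {
            return ".bz2";
        }
    }

    private static final List<CompressionCodec> CODECS =
        Arrays.<CompressionCodec>asList(new GzipLikeCodec(), new Bzip2LikeCodec());

    /** Hypothetical replacement for CompressionCodecFactory.getCodec(Path): match by suffix. */
    static CompressionCodec getCodec(String fileName) {
        for (CompressionCodec codec : CODECS) {
            if (fileName.endsWith(codec.getDefaultExtension())) {
                return codec;
            }
        }
        return null; // no codec matched: treat the file as uncompressed
    }

    /** Splitable when uncompressed, or when the codec itself supports splitting. */
    static boolean isSplitable(String fileName) {
        final CompressionCodec codec = getCodec(fileName);
        if (codec == null) {
            return true;
        }
        return codec instanceof SplittableCompressionCodec;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"part-00000", "part-00000.gz", "part-00000.bz2"}) {
            System.out.println(name + " splitable=" + isSplitable(name));
        }
    }
}
```

With "return codec == null", the .bz2 case above would be reported as not splitable, which is the behavior I am worried about.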
                
> old API CombineFileInputFormat missing fixes that are in new API 
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-5038
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5038
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 1.1.1
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
>         Attachments: MAPREDUCE-5038.patch
>
>
> The following changes patched the CombineFileInputFormat in mapreduce, but neglected the one in mapred:
> MAPREDUCE-1597 enabled the CombineFileInputFormat to work on splittable files
> MAPREDUCE-2021 solved returning duplicate hostnames in split locations
> MAPREDUCE-1806 CombineFileInputFormat does not work with paths not on default FS
> In trunk this is not an issue as the one in mapred extends the one in mapreduce.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
