crunch-dev mailing list archives

From "Dave Beech (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CRUNCH-165) Pipelines should automatically use CombineFileInputFormat where input consists of many small files
Date Wed, 13 Feb 2013 16:46:13 GMT
Dave Beech created CRUNCH-165:
---------------------------------

             Summary: Pipelines should automatically use CombineFileInputFormat where input consists of many small files
                 Key: CRUNCH-165
                 URL: https://issues.apache.org/jira/browse/CRUNCH-165
             Project: Crunch
          Issue Type: Improvement
          Components: Core
    Affects Versions: 0.4.0
            Reporter: Dave Beech
            Assignee: Josh Wills


Hive introduced a feature in HIVE-74 that uses CombineFileInputFormat when the input data
consists of many small files. This makes the resulting MapReduce jobs more efficient because
each mapper is given more data to process, instead of one mapper being launched per small
file. This would be a nice feature for Crunch to have, too.
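
To illustrate the idea of giving each mapper more data, here is a minimal, self-contained
sketch of packing many small files into fewer combined splits. It is not Hadoop's
CombineFileInputFormat implementation (which also takes node and rack locality into account)
and not Crunch code; the class name SplitPacker, the method pack, and the greedy in-order
strategy are all assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only: not part of Hadoop, Hive, or Crunch.
final class SplitPacker {
    private SplitPacker() {}

    /**
     * Greedily groups files, in order, into combined splits whose total size
     * stays at or below maxSplitBytes. A single file larger than the limit
     * is placed in a split of its own (a real input format would instead
     * break such a file into block-sized pieces).
     *
     * @return one list of file indices per combined split
     */
    static List<List<Integer>> pack(long[] fileSizes, long maxSplitBytes) {
        if (maxSplitBytes <= 0) {
            throw new IllegalArgumentException("maxSplitBytes must be positive");
        }
        List<List<Integer>> splits = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        long currentBytes = 0;
        for (int i = 0; i < fileSizes.length; i++) {
            // Close the current split if adding this file would exceed the limit.
            if (!current.isEmpty() && currentBytes + fileSizes[i] > maxSplitBytes) {
                splits.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(i);
            currentBytes += fileSizes[i];
        }
        // Emit the final, partially filled split.
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }
}
```

With file sizes {10, 20, 30, 100, 5} and a limit of 64, pack returns three splits instead of
five, so three mappers would be launched rather than one per file. In an actual job the
grouping would be delegated to Hadoop's CombineFileInputFormat rather than reimplemented.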

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
