hive-dev mailing list archives

From "Navis (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-3290) BucketizedHiveInputFormat should support combining files having same bucket number
Date Mon, 23 Jul 2012 08:02:34 GMT
Navis created HIVE-3290:
---------------------------

             Summary: BucketizedHiveInputFormat should support combining files having same bucket number
                 Key: HIVE-3290
                 URL: https://issues.apache.org/jira/browse/HIVE-3290
             Project: Hive
          Issue Type: Improvement
          Components: Query Processor
    Affects Versions: 0.10.0
            Reporter: Navis
            Assignee: Navis
            Priority: Minor


Currently, BucketizedHiveInputFormat creates one split per input file, which can result in
too many map tasks. If the input files are small (perhaps below a configurable threshold?),
combining files with the same bucket number and the same input format could help reduce total
execution time.
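
To make the proposal concrete, here is a minimal sketch of the grouping step, not Hive's actual implementation. The names BucketSplitCombiner, InputFile, CombinedSplit and the maxSplitBytes threshold are all hypothetical and exist only for illustration. Files are grouped by (bucket number, input format), and files within a group are packed into one split until the threshold would be exceeded.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: none of these types are part of Hive or Hadoop.
class BucketSplitCombiner {

    /** Hypothetical description of one input file of a bucketed table. */
    static final class InputFile {
        final String path;
        final int bucket;
        final String inputFormat;
        final long length;

        InputFile(String path, int bucket, String inputFormat, long length) {
            this.path = path;
            this.bucket = bucket;
            this.inputFormat = inputFormat;
            this.length = length;
        }
    }

    /** Hypothetical combined split: the files one map task would read. */
    static final class CombinedSplit {
        final int bucket;
        final String inputFormat;
        final List<String> paths;
        final long totalLength;

        CombinedSplit(int bucket, String inputFormat, List<String> paths, long totalLength) {
            this.bucket = bucket;
            this.inputFormat = inputFormat;
            this.paths = paths;
            this.totalLength = totalLength;
        }
    }

    /**
     * Groups files by (bucket, input format) and packs each group into splits
     * of at most maxSplitBytes. A single file larger than the threshold still
     * gets a split of its own; files are never divided.
     */
    static List<CombinedSplit> combine(List<InputFile> files, long maxSplitBytes) {
        // LinkedHashMap keeps groups in first-seen order for predictable output.
        Map<String, List<InputFile>> groups = new LinkedHashMap<>();
        for (InputFile f : files) {
            String key = f.bucket + "|" + f.inputFormat;
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(f);
        }

        List<CombinedSplit> splits = new ArrayList<>();
        for (List<InputFile> group : groups.values()) {
            InputFile first = group.get(0);
            List<String> paths = new ArrayList<>();
            long total = 0;
            for (InputFile f : group) {
                // Close the current split if this file would push it past the threshold.
                if (!paths.isEmpty() && total + f.length > maxSplitBytes) {
                    splits.add(new CombinedSplit(first.bucket, first.inputFormat,
                            new ArrayList<>(paths), total));
                    paths.clear();
                    total = 0;
                }
                paths.add(f.path);
                total += f.length;
            }
            // Emit whatever remains for this group.
            splits.add(new CombinedSplit(first.bucket, first.inputFormat,
                    new ArrayList<>(paths), total));
        }
        return splits;
    }
}
```

With a threshold of 25 bytes, two 10-byte files in bucket 0 sharing one input format would land in one split, while a 100-byte file in the same bucket, or a file in the same bucket with a different input format, would each get a separate split.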

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
