hadoop-hive-dev mailing list archives

From "Ning Zhang (JIRA)" <j...@apache.org>
Subject [jira] Created: (HIVE-1307) More generic and efficient merge method
Date Wed, 14 Apr 2010 18:36:49 GMT
More generic and efficient merge method
---------------------------------------

                 Key: HIVE-1307
                 URL: https://issues.apache.org/jira/browse/HIVE-1307
             Project: Hadoop Hive
          Issue Type: New Feature
    Affects Versions: 0.6.0
            Reporter: Ning Zhang
            Assignee: Ning Zhang
             Fix For: 0.6.0


Currently, if hive.merge.mapfiles or hive.merge.mapredfiles is true, a new MapReduce job is
created to read the input files and send them to one reducer for merging. This MR job is
created at compile time, with one MR job per partition. With dynamic partitions, multiple
partitions could be created at execution time, so generating the merge MR job at compile time
is impossible.


We should generalize the merge framework to allow multiple partitions. Most of the time,
a map-only job should be sufficient if we use CombineHiveInputFormat.
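
To illustrate the idea of a map-only merge over several partitions, here is a minimal
sketch in Python. It is not Hive code and does not reflect how CombineHiveInputFormat is
implemented; the function name plan_merge_splits, the (partition, path, size) record
layout, and the max_split_bytes parameter are all assumptions made for illustration. It
only shows how small files discovered at execution time could be grouped, per partition,
into combined splits so that each split is handled by a single map task with no reducer.

```python
from collections import defaultdict


def plan_merge_splits(files, max_split_bytes):
    """Group small files into per-partition splits (hypothetical helper).

    files: iterable of (partition, path, size_in_bytes) tuples, as they might
           be listed after the partitions have been created at execution time.
    Returns a list of (partition, [paths]); each entry stands for one map task
    that would write one merged output file for that partition.
    """
    by_partition = defaultdict(list)
    for partition, path, size in files:
        by_partition[partition].append((path, size))

    splits = []
    for partition in sorted(by_partition):
        current, current_bytes = [], 0
        for path, size in by_partition[partition]:
            # Start a new split when adding this file would exceed the cap.
            if current and current_bytes + size > max_split_bytes:
                splits.append((partition, current))
                current, current_bytes = [], 0
            current.append(path)
            current_bytes += size
        if current:
            splits.append((partition, current))
    return splits


if __name__ == "__main__":
    listing = [
        ("ds=1", "part-0", 40),
        ("ds=1", "part-1", 40),
        ("ds=1", "part-2", 40),
        ("ds=2", "part-0", 10),
    ]
    for partition, paths in plan_merge_splits(listing, max_split_bytes=100):
        print(partition, paths)
```

Because the splits are planned from the files that actually exist, the number of
partitions does not need to be known at compile time, and a split never mixes files
from different partitions.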
