hadoop-hive-dev mailing list archives

From "Namit Jain (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-1585) Customizable merge output size
Date Mon, 23 Aug 2010 22:38:15 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901638#action_12901638 ]

Namit Jain commented on HIVE-1585:

  <description>Size of merged files at the end of the job</description>

  <description>When the average output file size of a job is less than this number,
Hive will start an additional map-reduce job to merge the output files into bigger files.
 This is only done for map-only jobs if hive.merge.mapfiles is true, and for map-reduce jobs
if hive.merge.mapredfiles is true.</description>

Don't the above parameters meet your criteria?
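
For readers following the thread, the rule stated in the second description can be sketched roughly as below. This is only an illustration of the wording quoted above, not Hive's actual code: the function name, its arguments, and the sample sizes are all hypothetical. The property names were dropped from the quoted XML; the two descriptions appear to correspond to hive.merge.size.per.task and hive.merge.smallfiles.avgsize in hive-default.xml, but treat that mapping as an assumption.

```python
def should_start_merge_job(output_file_sizes, avg_size_threshold,
                           is_map_only, merge_mapfiles, merge_mapredfiles):
    """Return True if an extra merge job would be started, per the quoted description.

    All names here are hypothetical; sizes are in bytes.
    """
    if not output_file_sizes:
        # Nothing was written, so there is nothing to merge.
        return False

    # Map-only jobs are gated by hive.merge.mapfiles,
    # map-reduce jobs by hive.merge.mapredfiles.
    merge_enabled = merge_mapfiles if is_map_only else merge_mapredfiles
    if not merge_enabled:
        return False

    # Merge only when the average output file is smaller than the threshold.
    average_size = sum(output_file_sizes) / len(output_file_sizes)
    return average_size < avg_size_threshold


# Illustrative values only (assumed, not taken from the thread).
small_outputs = [1_000_000, 2_000_000, 3_000_000]
print(should_start_merge_job(small_outputs, 16_000_000,
                             is_map_only=True,
                             merge_mapfiles=True,
                             merge_mapredfiles=False))
```

Note that this check only decides whether a merge job runs. The size of the files that the merge job produces is governed separately, which is the point the issue description below raises.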

> Customizable merge output size
> ------------------------------
>                 Key: HIVE-1585
>                 URL: https://issues.apache.org/jira/browse/HIVE-1585
>             Project: Hadoop Hive
>          Issue Type: Improvement
>            Reporter: Ning Zhang
> Currently, if hive.merge.[mapfiles|mapredfiles] is true, the merged output file size
> is determined by the input split size, which in turn is determined by mapred.min.split.size,
> mapred.min.split.size.per.[node|rack], and mapred.max.split.size. Sometimes it is desirable
> to have a different output file size than the input split size.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
