hive-user mailing list archives

From Chen Song <chen.song...@gmail.com>
Subject Re: size of RCFile in hive
Date Wed, 26 Sep 2012 21:39:39 GMT
As far as I know, the number of files emitted is determined by the number
of mappers for a map-only job and by the number of reducers for a
map-reduce job.

So it depends entirely on how your query translates into an MR job.

For a query that has a reduce stage, you can force a single output file by
setting the property

*mapred.reduce.tasks=1*
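
A minimal sketch of how this might look in a Hive session. The source table
name texttable and the column id are hypothetical, chosen only for
illustration; rctable is the target table from the original question. The
distribute by clause is an assumption added here to make sure the query has a
reduce stage, since a plain select would run as a map-only job and the
reducer setting would then have no effect.

```sql
-- Assumption: texttable and its column id are hypothetical names.
-- Ask for a single reducer so the reduce stage writes a single file.
set mapred.reduce.tasks=1;

-- distribute by forces a reduce stage; without it this would be a
-- map-only job and the file count would follow the number of mappers.
insert overwrite table rctable
select * from texttable
distribute by id;
```

Note that a single reducer means all rows pass through one task, so this can
be slow for large inputs.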

Chen

On Wed, Sep 19, 2012 at 11:25 PM, 王锋 <wfeng1982@163.com> wrote:

> Hi,
>    I tried to convert and merge many small text files into RCFiles using
> Hive SQL, but Hive produced some small RCFiles.
> set hive.exec.compress.output=true;
> set mapred.output.compress=true;
> set mapred.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
> set io.compression.codecs=com.hadoop.compression.lzo.LzoCodec;
> set hive.merge.mapfiles=true;
> set hive.merge.mapredfiles=true;
> set hive.merge.size.per.task=640000000;
> set hive.merge.size.smallfiles.avgsize=80000000;
> insert overwrite table rctable select .....
>
>
> The merge settings:
> hive.merge.mapfiles=true
> hive.merge.mapredfiles=true
> hive.merge.size.per.task=640000000
> hive.merge.size.smallfiles.avgsize=80000000
> didn't work.
>
>
> Could anyone tell me how to solve this?




-- 
Chen Song
