hive-user mailing list archives

From Chen Song <chen.song...@gmail.com>
Subject Re: Re: size of RCFile in hive
Date Thu, 27 Sep 2012 13:21:54 GMT
You can force a reduce phase by adding a DISTRIBUTE BY or ORDER BY clause to
your SELECT query.
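
A minimal sketch of what that could look like, combined with the single-reducer
setting quoted below. The target table rctable is from the original post; the
source table src_table and the column col1 are hypothetical placeholders, so
substitute your own.

```sql
-- Sketch only: src_table and col1 are assumed names, not from the original post.
-- One reducer means the reduce phase writes a single output file.
set mapred.reduce.tasks=1;

-- DISTRIBUTE BY forces a reduce phase, so the reducer count
-- (not the mapper count) decides how many files are written.
INSERT OVERWRITE TABLE rctable
SELECT *
FROM src_table
DISTRIBUTE BY col1;
```

A single reducer can be slow on large inputs, so a somewhat higher reducer count
may be a better trade-off between file size and run time.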

On Thu, Sep 27, 2012 at 2:03 PM, 王锋 <wfeng1982@163.com> wrote:

> But it's a map-only job.
>
>
> At 2012-09-27 05:39:39,"Chen Song" <chen.song.82@gmail.com> wrote:
>
> As far as I know, the number of files emitted is determined by the
> number of mappers for a map-only job and by the number of reducers for a
> MapReduce job.
>
> So it depends entirely on how your query translates into an MR job.
>
> For a job with a reduce phase, you can force a single output file by
> setting the property
>
> *mapred.reduce.tasks=1*
>
> Chen
>
> On Wed, Sep 19, 2012 at 11:25 PM, 王锋 <wfeng1982@163.com> wrote:
>
>> Hi
>>    I tried to convert and merge many small text files into RCFiles using
>> Hive SQL, but Hive still produced a number of small RCFiles.
>> set hive.exec.compress.output=true;
>> set mapred.output.compress=true;
>> set mapred.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
>> set io.compression.codecs=com.hadoop.compression.lzo.LzoCodec;
>> set hive.merge.mapfiles=true;
>> set hive.merge.mapredfiles=true;
>> set hive.merge.size.per.task=640000000;
>> set hive.merge.size.smallfiles.avgsize=80000000;
>> insert overwrite table rctable select .....
>>
>>
>> The merge settings:
>> hive.merge.mapfiles=true
>> hive.merge.mapredfiles=true
>> hive.merge.size.per.task=640000000
>> hive.merge.size.smallfiles.avgsize=80000000
>> had no effect.
>>
>>
>> Could anyone tell me how to solve this?
>
>
>
>
> --
> Chen Song
>
>
>
>
>


-- 
Chen Song
