hive-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: how do I use multiple reducers in hive?
Date Wed, 19 Jan 2011 18:34:00 GMT
On Wed, Jan 19, 2011 at 12:00 PM, Ajo Fod <ajo.fod@gmail.com> wrote:
> The wiki probably needs to be fixed:
> For 32 buckets, I need to set the following flags.
>
>>>set hive.merge.mapfiles = false;
>>>set mapred.map.tasks=32;
>
> ... the set mapred.reduce.tasks ... is irrelevant.
>
> The query mechanism should ideally set this automatically !!
>
> Cheers,
> -Ajo
>
> On Wed, Jan 19, 2011 at 8:04 AM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
>> On Wed, Jan 19, 2011 at 10:46 AM, Ajo Fod <ajo.fod@gmail.com> wrote:
>>> I've 2 questions:
>>> 1) how to raise the number of reducers?
>>> 2) why are there only 2 bucket files per partition even though I
>>> specified 32 buckets?
>>>
>>>
>>> I've set the following and don't see an increase in the number of reducers.
>>>>>set hive.exec.reducers.max=32;
>>>>>set mapred.reduce.tasks=32;
>>>
>>> Could this be because the jobs are too small?
>>>
>>> I have a feeling that this is the cause for my having only 2 bucket
>>> files in each partition, in spite of specifying 32 buckets.
>>>
>>> -Ajo.
>>>
>>
>> I have never tried it, but you should use:
>>
>> set hive.enforce.bucketing = true;
>>
>> The number of reducers must equal the number of buckets. This is
>> described in the language manual.
>>
>> http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL/BucketedTables
>>
>

Feel free to update the wiki with a note that merging map files
(hive.merge.mapfiles) and map-only jobs may produce the wrong number of
bucket files.
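
A minimal sketch of the settings discussed in this thread, assuming a Hive release in which hive.enforce.bucketing is still a user-settable flag. The table names (events_raw, events_bucketed), the columns, and the partition value are hypothetical and do not come from the original posts.

```sql
-- Hypothetical target table with 32 buckets per partition.
CREATE TABLE events_bucketed (user_id BIGINT, payload STRING)
PARTITIONED BY (ds STRING)
CLUSTERED BY (user_id) INTO 32 BUCKETS;

-- Ask Hive to pick the reducer count from the table's bucket count.
set hive.enforce.bucketing = true;

-- Reported in this thread: merging map output files can leave fewer
-- files than buckets, so the merge step is switched off here.
set hive.merge.mapfiles = false;

-- Hypothetical source table and partition value.
INSERT OVERWRITE TABLE events_bucketed PARTITION (ds = '2011-01-19')
SELECT user_id, payload
FROM events_raw
WHERE ds = '2011-01-19';
```

If the settings take effect, the partition directory should hold 32 files after the insert. Listing that directory with dfs -ls from the Hive CLI is a quick way to check.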
