hive-user mailing list archives

From Ji ZHANG <>
Subject Re: Fail to Increase Hive Mapper Tasks?
Date Fri, 03 Jan 2014 04:27:06 GMT
Hi Rui,

I combined your suggestion with the answer from
and it works:

set = 20;
select count(*) from dw_stage.st_dw_marketing_touch_pi_metrics_basic;

It uses 22 mappers, though I don't know why it's not exactly 20.
I'm using Hive 0.9 with Hadoop 1.0.1.
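[Editorial note: as a rough sketch of why the requested mapper count is only a hint, the old Hadoop (mapred.*) FileInputFormat.getSplits() derives a goal split size from the requested map count, clamps it against the block size, and allows the final split to run up to 10% long. The model below assumes a single contiguous 150 MB file and a 64 MB block size, as in this thread; the real count also depends on file layout and CombineHiveInputFormat, which is presumably why Hive produced 22 rather than 20.]

```python
# Minimal model of split counting in the old-API (mapred) FileInputFormat.
# Assumption: one file of exactly total_size bytes; real getSplits() works
# per file/block and can yield a few more splits than this estimate.

SPLIT_SLOP = 1.1  # Hadoop lets the last split be up to 10% oversized

def num_splits(total_size, block_size, requested_maps, min_size=1):
    """Estimate the split count the old FileInputFormat would produce."""
    goal_size = total_size // max(requested_maps, 1)
    split_size = max(min_size, min(goal_size, block_size))
    splits = 0
    remaining = total_size
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    if remaining > 0:
        splits += 1  # tail split, possibly up to 10% oversized
    return splits

MB = 1024 * 1024
print(num_splits(150 * MB, 64 * MB, 2))   # low request -> 3 splits
print(num_splits(150 * MB, 64 * MB, 20))  # requesting 20 -> 20 splits
```

Under these assumptions the model reproduces the thread's "3 mappers" observation for a 150 MB file with 64 MB blocks, and shows the requested count is honored only approximately.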

Thank you very much.


On Fri, Jan 3, 2014 at 10:51 AM, Sun, Rui <> wrote:
> Hi, you can try set = 19.
> It seems that Hive is using the old Hadoop MapReduce API, so mapred.max.split.size
> won't work.
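[Editorial note: the archive has stripped the property name from the `set` commands above. A plausible reading, and purely an assumption on my part, is the old-API mapper-count hint, sketched below with a hypothetical table name:]

```sql
-- Assumed reconstruction of the stripped setting: the old-API (mapred)
-- hint for the number of map tasks. Hadoop treats it as a hint, not a
-- guarantee, so the actual mapper count may differ.
set mapred.map.tasks=19;
select count(*) from some_table;  -- hypothetical table name
```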
> -----Original Message-----
> From: Ji Zhang []
> Sent: Thursday, January 02, 2014 3:56 PM
> To:
> Subject: Fail to Increase Hive Mapper Tasks?
> Hi,
> I have a managed Hive table, which contains only one 150 MB file. I then run "select count(*)
> from tbl" against it, and it uses 2 mappers. I want to increase the number of mappers.
> First I tried 'set mapred.max.split.size=8388608;', hoping it would use 19 mappers, but
> it only uses 3. Somehow it still splits the input by 64 MB. I also tried 'set dfs.block.size=8388608;',
> which didn't work either.
> Then I tried a vanilla MapReduce job to do the same thing. It initially uses 3 mappers,
> and when I set mapred.max.split.size, it uses 19. So the problem lies in Hive, I suppose.
> I read some of the Hive source code, like CombineHiveInputFormat and ExecDriver, but
> couldn't find a clue.
> What else settings can I use?
> Thanks in advance.
> Jerry
