hive-user mailing list archives

From Ji Zhang
Subject Fail to Increase Hive Mapper Tasks?
Date Thu, 02 Jan 2014 07:55:52 GMT

I have a managed Hive table that contains a single 150MB file. When I
run "select count(*) from tbl" on it, the query uses 2 mappers. I want
to increase that number.

First I tried 'set mapred.max.split.size=8388608;' (8MB), expecting
about 19 mappers, but the query used only 3. Somehow the input is still
split at 64MB. I also tried 'set dfs.block.size=8388608;', which had no
effect.

Then I tried a vanilla MapReduce job that does the same thing. It
uses 3 mappers by default, and 19 once I set mapred.max.split.size.
So I suppose the problem lies in Hive.
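
For reference, the 3 and 19 mapper counts seen with the plain MapReduce job match the split arithmetic of Hadoop's new-API FileInputFormat, which computes the split size as max(minSize, min(maxSize, blockSize)) and lets the last split run up to 10% over. The sketch below reproduces that arithmetic in Python. It assumes a 64MB block size and a minimum split size of 1 byte, neither of which is stated explicitly above, and the function names are illustrative only. It does not model how Hive's own input formats build splits.

```python
# Sketch of the split arithmetic in Hadoop's new-API FileInputFormat.
# Assumptions (not stated in the post): 64MB block size, min split size 1.
# The function names here are illustrative, not Hadoop API names.

SPLIT_SLOP = 1.1  # the last split may be up to 10% larger than split_size
MB = 1024 * 1024


def compute_split_size(block_size, min_size, max_size):
    """Split size is the block size clamped to [min_size, max_size]."""
    return max(min_size, min(max_size, block_size))


def count_splits(file_size, split_size):
    """Count the splits produced for a single splittable file."""
    splits = 0
    remaining = file_size
    # Carve off full-size splits while the remainder exceeds the slop factor.
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    # Whatever is left becomes the final split.
    if remaining > 0:
        splits += 1
    return splits


file_size = 150 * MB
block_size = 64 * MB  # assumed

# No cap on the maximum split size: splits follow the block size.
default_splits = count_splits(
    file_size, compute_split_size(block_size, 1, 2**63 - 1)
)

# With mapred.max.split.size=8388608 (8MB).
capped_splits = count_splits(
    file_size, compute_split_size(block_size, 1, 8388608)
)

print(default_splits, capped_splits)
```

Under these assumptions the uncapped case gives 3 splits and the 8MB cap gives 19, which is what the plain job shows. An input format that derives its split size differently would not follow this calculation.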

I read some of the Hive source code, such as CombineHiveInputFormat and
ExecDriver, but couldn't find a clue.

What other settings can I try?

Thanks in advance.

