hive-user mailing list archives

From "Sun, Rui" <rui....@intel.com>
Subject RE: Fail to Increase Hive Mapper Tasks?
Date Fri, 03 Jan 2014 02:51:57 GMT
Hi, you can try 'set mapred.map.tasks=19;'.
It seems that Hive is using the old Hadoop MapReduce API, so mapred.max.split.size won't
take effect.
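
To illustrate why the number of map tasks would matter here, below is a rough sketch of how the old-API FileInputFormat sizes splits for a single file: the requested map count acts as a hint that sets a goal size, which is then capped by the block size. This assumes the Hive job actually goes through that code path (CombineHiveInputFormat may behave differently), that the file is exactly 150 MiB, that the block size is 64 MiB, and that mapred.map.tasks defaults to 2. The function name is made up for illustration.

```python
def old_api_split_count(total_size, num_splits_hint, block_size,
                        min_size=1, split_slop=1.1):
    """Approximate split count for one file under the old-API sizing rule."""
    # The map-task hint determines the goal size of each split.
    goal_size = total_size // max(num_splits_hint, 1)
    # The split size is the goal size, capped by the block size
    # and floored by the minimum split size.
    split_size = max(min_size, min(goal_size, block_size))

    count = 0
    remaining = total_size
    # Keep cutting full-size splits while more than ~1.1 splits remain.
    while remaining / split_size > split_slop:
        remaining -= split_size
        count += 1
    # Whatever is left becomes the final split.
    if remaining > 0:
        count += 1
    return count


MB = 1024 * 1024
print(old_api_split_count(150 * MB, 2, 64 * MB))   # hint of 2 (assumed default)
print(old_api_split_count(150 * MB, 19, 64 * MB))  # hint of 19
```

Under these assumptions the first call gives 3 splits and the second gives 19, which would match the 3 mappers seen with the 64MB split and the 19 hoped for.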

-----Original Message-----
From: Ji Zhang [mailto:zhangji87@gmail.com] 
Sent: Thursday, January 02, 2014 3:56 PM
To: user@hive.apache.org
Subject: Fail to Increase Hive Mapper Tasks?

Hi,

I have a managed Hive table that contains only one 150MB file. When I run "select count(*)
from tbl" on it, it uses 2 mappers. I want to increase that number.

First I tried 'set mapred.max.split.size=8388608;', hoping it would use 19 mappers (150MB / 8MB,
rounded up). But it only uses 3; somehow it still splits the input by 64MB. I also tried
'set dfs.block.size=8388608;', but that did not work either.

Then I tried a vanilla MapReduce job to do the same thing. It initially used 3 mappers, and
when I set mapred.max.split.size, it used 19. So the problem lies in Hive, I suppose.

I read some of the Hive source code, such as CombineHiveInputFormat and ExecDriver, but couldn't
find a clue.

What other settings can I use?

Thanks in advance.

Jerry