hive-user mailing list archives

From "Daniel,Wu" <hadoop...@163.com>
Subject Re:RE: Why a sql only use one map task?
Date Wed, 24 Aug 2011 06:43:55 GMT
I checked my settings, and all of them are at their default values. So per the book "Hadoop: The
Definitive Guide", the split size should be 64 MB. The file is about 500 MB, so that's about 8 splits
(500 / 64 ≈ 7.8). From the map job information (after the map job is done), I can see it gets 8 splits
from one node. But it still starts only one map task.




At 2011-08-24 02:28:18,"Aggarwal, Vaibhav" <vaggarw@amazon.com> wrote:


If you actually have splittable files, you can create more splits by setting
mapred.max.split.size appropriately.
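
As a minimal sketch of the suggestion above, the property could be set for the current Hive session before running the query. The 64 MB value is only an illustrative assumption matching the split size mentioned in this thread, not a recommendation, and whether it changes the number of map tasks depends on the input format in use.

```sql
-- Assumption: run in the same Hive CLI session, before the query.
-- The value is in bytes: 64 * 1024 * 1024 = 67108864 (illustrative only).
set mapred.max.split.size=67108864;

-- The query from the original question.
select count(*) from sales;
```

Running `set mapred.max.split.size;` with no value prints the setting currently in effect, which is a quick way to confirm the change was picked up.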

 

Thanks

Vaibhav

 

From: Daniel,Wu [mailto:hadoop_wu@163.com]
Sent: Tuesday, August 23, 2011 6:51 AM
To: hive
Subject: Why a sql only use one map task?

 

I ran the following simple SQL query:
select count(*) from sales;
And the job information shows it only uses one map task.

The underlying Hadoop cluster has 3 data nodes, so I expected Hive to kick off 3 map tasks,
one on each node. What can make Hive run only one map task? Do I need to set something
to kick off multiple map tasks? I didn't change the Hive config.

 