hadoop-common-user mailing list archives

From "Vijay Murthi" <murt...@yahoo-inc.com>
Subject Map Tasks assignment to nodes
Date Fri, 12 May 2006 00:45:56 GMT
I am trying to run 8 map tasks and 2 reduce tasks on 3 machines. Each
map task processes one 6 MB text file, and there are 500 such files. The
monitoring page shows far fewer map tasks running than intended.
Sometimes some nodes are not assigned any tasks at all, even though a
large number of files still remain to be scheduled for the map
operation. Is this due to how the files are distributed across nodes? In
fact, my file system is set to local.

Some important parameters are listed below:
io.sort.factor=100
io.sort.mb=1000
io.file.buffer.size=4096000
io.bytes.checksum=128

mapred.map.tasks=16
mapred.reduce.tasks=2
mapred.tasktracker.tasks.maximum=4
mapred.combine.buffer.size=100000
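
As a rough sanity check on the numbers above, the upper bound on concurrently running tasks can be worked out as in the sketch below. It assumes that mapred.tasktracker.tasks.maximum is a per-node cap on simultaneous tasks, that each of the 3 machines runs one TaskTracker, and that one map task is created per input file. None of these assumptions is confirmed here, and the variable names are purely illustrative.

```python
# Values taken from the message; how they combine is an assumption.
nodes = 3                     # machines in the cluster
tasks_maximum_per_node = 4    # mapred.tasktracker.tasks.maximum
input_files = 500             # number of input text files
file_size_mb = 6              # size of each input file in MB

# Assumed upper bound on tasks running at the same time across the cluster.
max_concurrent_tasks = nodes * tasks_maximum_per_node

# Total input volume across all files.
total_input_mb = input_files * file_size_mb

# Assuming one map task per file, the minimum number of scheduling "waves"
# needed to get through all files (ceiling division).
min_map_waves = -(-input_files // max_concurrent_tasks)

print("max concurrent tasks:", max_concurrent_tasks)
print("total input (MB):", total_input_mb)
print("minimum map waves:", min_map_waves)
```

If the monitoring page consistently shows far fewer running map tasks than this assumed bound, the shortfall would point at task assignment rather than at the per-node limit.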


Is there any parameter I am missing to maximize the use of all CPUs?


Thanks,
VJ




