hadoop-user mailing list archives

From Marcelo Elias Del Valle <mvall...@gmail.com>
Subject number of mapper tasks
Date Mon, 28 Jan 2013 15:54:00 GMT
Hello,

    I am using Hadoop with TextInputFormat, a mapper and no reducers. I am
running my jobs on Amazon EMR. When I run my job, I set both of the
following options:
-s,mapred.tasktracker.map.tasks.maximum=10
-jobconf,mapred.map.tasks=10
    When I run my job with just 1 instance, I see it only creates 1 mapper.
When I run my job with 5 instances (1 master and 4 core nodes), I can see
that only 2 mapper slots are used and 6 stay open.
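
    For context, here is a minimal sketch of why the number of mappers can
differ from mapred.map.tasks. It assumes the job goes through the old-API
(org.apache.hadoop.mapred) FileInputFormat of Hadoop 1.x, where
mapred.map.tasks is only a hint and the actual number of map tasks equals
the number of input splits. The class and method names (SplitEstimate,
estimateSplits) are hypothetical, the model covers a single input file, and
the 64 MB block size used in the usage note is an assumed value, not one
taken from this cluster.

```java
// Hypothetical helper: approximates how the Hadoop 1.x old-API
// FileInputFormat turns ONE input file into splits (one map task per split).
final class SplitEstimate {
    // The last split may be up to 10% larger than splitSize.
    private static final double SPLIT_SLOP = 1.1;

    // splitSize = max(minSize, min(goalSize, blockSize))
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // requestedMaps plays the role of mapred.map.tasks (a hint only);
    // minSize plays the role of mapred.min.split.size.
    static int estimateSplits(long fileSize, int requestedMaps, long minSize,
                              long blockSize, boolean splittable) {
        // Empty or non-splittable files (e.g. gzip) become a single split.
        if (!splittable || fileSize == 0) {
            return 1;
        }
        long goalSize = fileSize / Math.max(1, requestedMaps);
        long splitSize = computeSplitSize(goalSize, Math.max(1, minSize), blockSize);

        int splits = 0;
        long remaining = fileSize;
        // Carve off full-size splits while more than 1.1 splits' worth remains.
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        if (remaining != 0) {
            splits++; // the tail becomes the final split
        }
        return splits;
    }
}
```

    Under these assumptions, a call such as
SplitEstimate.estimateSplits(size, 10, 1, 64L * 1024 * 1024, false) always
gives 1, so a single non-splittable input file would be handled by one
mapper regardless of how many map slots are open.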

    I am trying to figure out why I am not able to run more mappers in
parallel. When I look at the logs, I find messages like these:

INFO org.apache.hadoop.mapred.ReduceTask (main):
attempt_201301281437_0001_r_000003_0 Scheduled 0 outputs (0 slow hosts
and0 dup hosts)
org.apache.hadoop.mapred.ReduceTask (main):
attempt_201301281437_0001_r_000003_0 Need another 1 map output(s)
where 0 is already in progress

    Any hints? They would be highly appreciated.

Best regards,
-- 
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
