hadoop-common-user mailing list archives

From Gopal Gandhi <gopal.gandhi2...@yahoo.com>
Subject Re: how to increase number of reduce tasks
Date Thu, 31 Jul 2008 19:33:55 GMT
I think this is because your job's input data exists on only one node. Mappers are launched only
on nodes that hold the data (Hadoop stores data in units it calls "blocks").
As for the reducer, I am not sure why there's only 1. Can anybody explain that?
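For what it's worth, mapred.reduce.tasks defaults to 1, so a job that never sets it will always run a single reducer. A sketch of the relevant properties for hadoop-site.xml (property names as of Hadoop 0.17; the values here are illustrative, tune them for your cluster):

```xml
<!-- hadoop-site.xml: illustrative values only, adjust for your cluster -->
<property>
  <name>mapred.reduce.tasks</name>
  <!-- default is 1; a common rule of thumb is roughly
       0.95 * (number of nodes) * (reduce slots per node) -->
  <value>5</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <!-- reduce slots each TaskTracker may run concurrently -->
  <value>2</value>
</property>
```

Note that a job can also call JobConf.setNumReduceTasks(n) in code, which overrides the site config, so if the application hard-codes the reducer count the config change will appear to have no effect.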

----- Original Message ----
From: Alexander Aristov <alexander.aristov@gmail.com>
To: core-user@hadoop.apache.org
Sent: Thursday, July 31, 2008 12:06:59 PM
Subject: how to increase number of reduce tasks


I am running nutch on hadoop 0.17.1. I launch 5 nodes to perform crawling.

When I look at the job statistics I see that only 1 reduce task is started
for all steps, so I conclude that hadoop isn't consuming all
available resources.

Only one node is extremely busy; the other nodes are idle. How can I configure
hadoop to consume all resources?

I added the mapred.map.tasks and mapred.reduce.tasks parameters but they have no effect.
I also increased the max number of mapred tasks, and the job tracker shows it.

During all stages, map tasks reach a maximum of 3, and reduce only 1.

Best Regards
Alexander Aristov
