hadoop-user mailing list archives

From Arindam Choudhury <arindamchoudhu...@gmail.com>
Subject Running terasort with 1 map task
Date Tue, 26 Feb 2013 11:09:29 GMT
Hi all,

I am trying to run terasort with one map task and one reduce task, so I
generated the input data using:

hadoop jar hadoop-examples-1.0.4.jar teragen -Dmapred.map.tasks=1
-Dmapred.reduce.tasks=1 32000000 /user/hadoop/input32mb1map

Then I launched the hadoop terasort job using:

hadoop jar hadoop-examples-1.0.4.jar terasort -Dmapred.map.tasks=1
-Dmapred.reduce.tasks=1 /user/hadoop/input32mb1map /user/hadoop/output1

I expected the job to run with 1 map and 1 reduce, but when I inspected
the job statistics I found:

hadoop job -history /user/hadoop/output1

Task Summary
============================
Kind     Total  Successful  Failed  Killed  StartTime             FinishTime

Setup    1      1           0       0       26-Feb-2013 10:57:47  26-Feb-2013 10:57:55 (8sec)
Map      24     24          0       0       26-Feb-2013 10:57:57  26-Feb-2013 11:05:37 (7mins, 40sec)
Reduce   1      1           0       0       26-Feb-2013 10:58:21  26-Feb-2013 11:08:31 (10mins, 10sec)
Cleanup  1      1           0       0       26-Feb-2013 11:08:32  26-Feb-2013 11:08:36 (4sec)
============================

So, although I asked for one map task, 24 of them were launched.
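
As a rough sanity check on where 24 might come from, here is a minimal sketch. It relies on TeraGen writing 100-byte rows, and it assumes (neither is stated anywhere above) that the HDFS block size on this cluster is 128 MB and that roughly one map task is launched per block-sized input split. The function name is only illustrative.

```python
import math

ROW_BYTES = 100                    # TeraGen writes 100-byte rows
ROWS = 32_000_000                  # row count passed to teragen above
BLOCK_BYTES = 128 * 1024 * 1024    # ASSUMPTION: 128 MB HDFS block size


def estimated_map_tasks(rows, row_bytes=ROW_BYTES, block_bytes=BLOCK_BYTES):
    """Estimate the map count as one split per block (hypothetical helper).

    This ignores details such as split slop and minimum split size.
    """
    total_bytes = rows * row_bytes
    return math.ceil(total_bytes / block_bytes)


total_bytes = ROWS * ROW_BYTES
print(total_bytes)                 # 3200000000 bytes, i.e. about 3.2 GB
print(estimated_map_tasks(ROWS))   # 24 under the assumptions above
```

If those assumptions hold, the generated input is about 3.2 GB rather than 32 MB, and the map count would be following the number of input splits, with mapred.map.tasks acting only as a hint.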

How can I solve this problem? How do I tell Hadoop to launch only one map?

Thanks,
