hadoop-mapreduce-user mailing list archives

From psdc1978 <psdc1...@gmail.com>
Subject hadoop tasktracker
Date Wed, 09 Jun 2010 15:17:59 GMT
Hi,

If I set the property mapred.reduce.tasks to 1 in mapred-site.xml, how
many reduce tasks will actually run? I think 2 will run, and I don't know
why: logging that I added shows that both constructors of the
ReduceTask.java class run ( ReduceTask() and ReduceTask(with
parameters) ).

I don't understand why ReduceTask() [with no parameters] runs. Here's
the stack trace that I captured to follow the thread of execution that
calls this constructor.
[code]
     java.lang.Exception:
        at org.apache.hadoop.mapred.ReduceTask.<init>(ReduceTask.java:164)
        at org.apache.hadoop.mapred.LaunchTaskAction.readFields(LaunchTaskAction.java:62)
        at org.apache.hadoop.mapred.HeartbeatResponse.readFields(HeartbeatResponse.java:137)
        at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:237)
        at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:66)
        at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:510)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:445)
[/code]

As you can see, the ReduceTask() call originates in the Connection class. Can
anyone explain to me the purpose of this thread of execution, and the purpose
of the Client class?
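
One possible reading of the trace, offered as a sketch rather than a confirmed
answer: the readFields frames suggest that the task is being deserialized from
a heartbeat response, and Writable-style deserialization typically creates an
empty object with the no-argument constructor and then fills it from the
stream. If that is what happens here, the second constructor call would be the
receiving side rebuilding the same task, not a second reduce task. The example
below uses no Hadoop classes; SimpleWritable, SketchTask, send and receive are
hypothetical names invented for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class WritableSketch {

    // Hypothetical stand-in for a Writable-style interface (not the Hadoop one).
    interface SimpleWritable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Hypothetical stand-in for a task with two constructors.
    static class SketchTask implements SimpleWritable {
        static int noArgCalls = 0;   // how often the empty constructor ran
        static int paramCalls = 0;   // how often the parameterized one ran

        String taskId = "";
        int partition = -1;

        // Used by the receiver: creates an empty shell to be filled by readFields.
        SketchTask() {
            noArgCalls++;
        }

        // Used by the sender: creates the real task description.
        SketchTask(String taskId, int partition) {
            paramCalls++;
            this.taskId = taskId;
            this.partition = partition;
        }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeUTF(taskId);
            out.writeInt(partition);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            taskId = in.readUTF();
            partition = in.readInt();
        }
    }

    // Sender side: serialize the task to bytes (stands in for the RPC response).
    static byte[] send(SketchTask task) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        task.write(new DataOutputStream(buffer));
        return buffer.toByteArray();
    }

    // Receiver side: build an empty object first, then populate it from the bytes.
    static SketchTask receive(byte[] wire) throws IOException {
        SketchTask task = new SketchTask();
        task.readFields(new DataInputStream(new ByteArrayInputStream(wire)));
        return task;
    }

    public static void main(String[] args) throws IOException {
        SketchTask original = new SketchTask("reduce_0", 0);
        SketchTask copy = receive(send(original));
        System.out.println("id=" + copy.taskId
                + " partition=" + copy.partition
                + " noArgCalls=" + SketchTask.noArgCalls
                + " paramCalls=" + SketchTask.paramCalls);
    }
}
```

In this sketch each constructor runs once for a single logical task, which
would match seeing both constructors in the log while only one reduce task
actually executes.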


2 -
A ReduceTask is launched by a TaskTracker in a new child JVM, right?

3 -
A TaskTracker is a thread that can run several map and reduce tasks at the
same time, right?


Thanks,
-- 
Pedro
