hadoop-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Which data sets were processed by each tasktracker?
Date Fri, 03 May 2013 09:07:57 GMT
You probably need to be using a release that includes
https://issues.apache.org/jira/browse/MAPREDUCE-3678. It prints the
input split to the task logs, so you can tell what each task
processed (so long as the input split type, such as file splits,
has an intelligible toString() output).
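
Since you have your own FileSystem implementation, the toString() part
may be the bit that matters to you. Here is a rough, standalone sketch of
what I mean. RangeSplit is a hypothetical stand-in for a custom split (a
real one would extend org.apache.hadoop.mapreduce.InputSplit, left out
here so the snippet compiles without Hadoop on the classpath), the
myfs:// path is made up, and the log prefix is only illustrative, not
necessarily the exact text the framework writes.

```java
public class SplitToStringSketch {

    // Hypothetical custom split. In a real job this would extend
    // org.apache.hadoop.mapreduce.InputSplit; that is omitted here so the
    // example is self-contained.
    static final class RangeSplit {
        private final String path;
        private final long start;
        private final long length;

        RangeSplit(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }

        // Without this override the log would only show the default
        // Object.toString() form (class name plus hash code), which says
        // nothing about which data the task read.
        @Override
        public String toString() {
            return path + ":" + start + "+" + length;
        }
    }

    // Illustrative only: logging a split by string concatenation calls
    // its toString(). The prefix is an assumption for this sketch.
    static String logLine(Object split) {
        return "Processing split: " + split;
    }

    public static void main(String[] args) {
        RangeSplit split = new RangeSplit("myfs://data/input.txt", 0L, 67108864L);
        System.out.println(logLine(split));
    }
}
```

With an override like that, each map task's log line identifies the path
and byte range it handled, and you can match task attempts to
TaskTrackers from there.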

On Fri, May 3, 2013 at 1:44 PM, Agarwal, Nikhil
<Nikhil.Agarwal@netapp.com> wrote:
> Hi,
> I have a 3-node cluster, with JobTracker running on one machine and
> TaskTrackers on other two. Instead of using HDFS, I have written my own
> FileSystem implementation. I am able to run a MapReduce job on this cluster
> but I am not able to make out from logs or TaskTracker UI, which data sets
> were exactly processed by each of the two slaves.
> Can you please tell me some way to find out what exactly did each of my
> tasktracker do during the entire job execution? I am using Hadoop-1.0.4
> source code.
> Thanks & Regards,
> Nikhil

Harsh J
