hadoop-mapreduce-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Capturing Map/reduce task run times and bytes read
Date Sat, 03 Dec 2011 09:13:49 GMT

Inline again.

On 03-Dec-2011, at 12:39 PM, arun k wrote:
> Q>Is the map/reduce task run time displayed in the web GUI decent/accurate enough?

Don't see why not. We only display what's been genuinely collected. What you get out of the
API or on the CLI is exactly the same data. Or perhaps I do not understand your question
completely here - what's led you to ask this?

> Q>If I want to find the I/O rate of a task, will the total number of FILE bytes and HDFS
> bytes read/written divided by the task run time give it approximately?

Yes, that should give you a stop-watch measure: the wall-clock time from task start to task
end, set against the byte counters the task reports for itself.
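
As a rough illustration of that stop-watch measure, here is a minimal sketch that sums a
task's byte counters and divides by its run time. The counter names are the standard
FILE/HDFS byte counters; the function name, the plain dict holding the counters, and the
sample numbers are assumptions made up for this example, not something read from a real job.

```python
# Hypothetical helper: `counters` is assumed to be a plain dict of
# counter name -> value, copied by hand from the web UI or the CLI.
BYTE_COUNTERS = (
    "FILE_BYTES_READ",
    "FILE_BYTES_WRITTEN",
    "HDFS_BYTES_READ",
    "HDFS_BYTES_WRITTEN",
)


def io_rate_bytes_per_sec(counters, start_ms, finish_ms):
    """Approximate I/O rate: total bytes moved / wall-clock run time."""
    elapsed_sec = (finish_ms - start_ms) / 1000.0
    if elapsed_sec <= 0:
        raise ValueError("finish time must be later than start time")
    # Counters that the task did not report are treated as zero.
    total_bytes = sum(counters.get(name, 0) for name in BYTE_COUNTERS)
    return total_bytes / elapsed_sec


# Made-up sample values, for illustration only.
sample = {"FILE_BYTES_READ": 1000, "HDFS_BYTES_READ": 3000}
rate = io_rate_bytes_per_sec(sample, start_ms=0, finish_ms=2000)
print(rate)
```

Note that this averages over the whole task lifetime, including time spent on computation
rather than I/O, so it is only an approximation of the actual disk or network throughput.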

> Q>Does the FILE bytes read counter for the reduce task include the map output bytes
> read non-locally over the network, or the bytes read locally from the map output after
> it is copied locally?

FILE counters are from whatever is read off a local filesystem (file:///), so would mean the
latter. If you look again, you will notice another counter named "Reduce shuffle bytes" that
gives you the former count - separately.
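
To make the distinction concrete, a small sketch that reports the two counts side by side
for one reduce task. REDUCE_SHUFFLE_BYTES is the internal name behind "Reduce shuffle
bytes"; the function name, the dict layout, and the sample values are hypothetical and
only for illustration.

```python
def reduce_input_breakdown(counters):
    """Separate bytes fetched during shuffle from bytes read off local disk.

    `counters` is assumed to be a plain dict of counter name -> value
    for a single reduce task.
    """
    return {
        # Map output fetched by the reducer during the shuffle phase.
        "shuffled_bytes": counters.get("REDUCE_SHUFFLE_BYTES", 0),
        # Everything the task read from the local filesystem (file:///).
        "local_file_bytes_read": counters.get("FILE_BYTES_READ", 0),
    }


# Made-up sample values, for illustration only.
sample = {"REDUCE_SHUFFLE_BYTES": 500, "FILE_BYTES_READ": 700}
print(reduce_input_breakdown(sample))
```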