hadoop-mapreduce-user mailing list archives

From Alexandru Calin <alexandrucali...@gmail.com>
Subject Re: I/O time when reading from HDFS in Hadoop
Date Sat, 11 Jun 2016 17:28:15 GMT

Firstly, thank you for your response.
To be more exact, I am interested in measuring the time spent in each of the
following intervals: [*a*: *{CLI launch & HDFS read}*]--[*b*: *{user-defined
map/reduce}*]--[*c*: *{writing processed data to HDFS, end of job}*]. I
want to measure how turning compression on and off for the input and output
data changes the time between the a--b--c boundaries and, ultimately, the total
execution time of a MapReduce job (a--c). I am using standard benchmarks
such as WordCount.
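
As a minimal sketch of the comparison described above, assuming the four
boundary timestamps of a run have already been collected somehow (the class
name, field names, and numbers below are hypothetical placeholders, not
measurements, and in a real job the phases may overlap rather than being
cleanly separated):

```python
from dataclasses import dataclass


@dataclass
class PhaseBoundaries:
    """Four hypothetical timestamps (in seconds) delimiting phases a, b, c."""
    launch: float         # CLI launch, start of phase a (HDFS read)
    compute_start: float  # end of a, start of b (user-defined map/reduce)
    write_start: float    # end of b, start of c (writing output to HDFS)
    job_end: float        # end of c, end of job


def phase_durations(t: PhaseBoundaries) -> dict:
    """Return the duration of each phase plus the total a--c time."""
    return {
        "a": t.compute_start - t.launch,
        "b": t.write_start - t.compute_start,
        "c": t.job_end - t.write_start,
        "total": t.job_end - t.launch,
    }


def compare(off: PhaseBoundaries, on: PhaseBoundaries) -> dict:
    """Per-phase difference: compression on minus compression off."""
    d_off, d_on = phase_durations(off), phase_durations(on)
    return {phase: d_on[phase] - d_off[phase] for phase in d_off}


# Placeholder values for illustration only, not real measurements.
run_off = PhaseBoundaries(0.0, 10.0, 40.0, 50.0)
run_on = PhaseBoundaries(0.0, 8.0, 42.0, 48.0)
print(phase_durations(run_off))
print(compare(run_off, run_on))
```

A negative entry in the result of `compare` means that phase was shorter with
compression enabled; a positive entry means it was longer.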

Thanks again,

On Sat, Jun 11, 2016 at 7:52 PM, Daniel Schulz <danielschulz2005@hotmail.com> wrote:

> Hello Alexandru,
> So if you are solely interested in the latencies, why not use the Linux
> time command from the shell? Just use the Hadoop CLI to fetch your
> file; try this from several nodes on various racks, for different files
> in your cluster, and build a confidence interval for the time it took to
> retrieve each file from any node and rack.
> Otherwise, a more holistic approach would be to use this project:
> epaulson.github.io/HadoopInternals/benchmarks.html. Its Ohio State
> InfiniBand benchmark contains latency information on sequential and random
> read and write operations, and more.
> Hope this helps…
> Kind regards, Daniel.
> Sent from my iPad
> On 11 Jun 2016, at 17:22, Alexandru Calin <alexandrucalin29@gmail.com>
> wrote:
> Hello,
> I would like to measure the time taken for map and reduce when performing
> I/O (reading from HDFS) in Hadoop. I am using YARN with Hadoop 2.6.0. What
> are the options for that?
> Thanks
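
The suggestion in the quoted reply (time a CLI read repeatedly, then build a
confidence interval) could be sketched roughly as follows. This is an
illustration under stated assumptions, not a definitive method: the helper
names are hypothetical, the HDFS path is a placeholder, and the interval uses
a normal approximation (z = 1.96 for about 95%), which is only rough for a
small number of runs.

```python
import math
import statistics
import subprocess
import time


def time_command(cmd, runs=5):
    """Run cmd repeatedly and return the wall-clock time of each run in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the command's output so only the elapsed time is kept.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return samples


def confidence_interval(samples, z=1.96):
    """Normal-approximation confidence interval for the mean of samples."""
    mean = statistics.mean(samples)
    half_width = z * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean - half_width, mean + half_width


# Example usage (placeholder path; needs a working Hadoop client on this node):
# samples = time_command(["hadoop", "fs", "-cat", "/path/to/file"], runs=10)
# print(confidence_interval(samples))
```

Repeating the measurement from several nodes and for several files, as the
reply proposes, would give one such interval per node and file combination.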
