hadoop-common-user mailing list archives

From "Naama Kraus" <naamakr...@gmail.com>
Subject Hadoop tracing
Date Thu, 18 Sep 2008 10:25:20 GMT
Hi,

I am looking for information on Hadoop tracing, instrumentation,
benchmarking, and so forth.
What utilities exist? How mature are they? Where can I get more information
about them?

I am curious about statistics on Hadoop behavior (for a typical workload, or
across different workloads). I am thinking of metrics such as:
Percentage of time a Hadoop job spends in each phase (map, sort &
shuffle, reduce), and on I/O, network, framework execution, and user code
execution.
Known bottlenecks.
Any other interesting statistics.
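
To make the first metric concrete, here is a minimal sketch of the kind of
breakdown I have in mind. The phase names and timings are made up for
illustration (they are not measurements), and `phase_percentages` is a
hypothetical helper, not part of any Hadoop API.

```python
def phase_percentages(phase_seconds):
    """Return each phase's share of the total job time, as a percentage."""
    total = sum(phase_seconds.values())
    if total <= 0:
        raise ValueError("total time must be positive")
    return {phase: 100.0 * secs / total for phase, secs in phase_seconds.items()}


# Hypothetical per-phase wall-clock times in seconds (assumed, not measured).
example = {"map": 120.0, "sort_shuffle": 60.0, "reduce": 20.0}
shares = phase_percentages(example)

for phase, pct in shares.items():
    print(f"{phase}: {pct:.1f}%")
```

The same calculation would apply to any other split of the total time, such
as framework time versus user code time, provided the parts do not overlap.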

Has anyone already measured these? Are there any documented statistics out
there?

I have already come across a few things: the X-Trace based tracing tool from
Berkeley, the Hadoop metrics API, the Hadoop instrumentation API
(HADOOP-3772), Hadoop Vaidya (HADOOP-4179), and the gridmix benchmark.

Does anyone have input on any of those?
Is there anything else I missed?

Thanks for any direction,
Naama

-- 
oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo
00 oo 00 oo
"If you want your children to be intelligent, read them fairy tales. If you
want them to be more intelligent, read them more fairy tales." (Albert
Einstein)
