hadoop-common-user mailing list archives

From Shrinivas Joshi <jshrini...@gmail.com>
Subject Re: benchmark choices
Date Mon, 21 Feb 2011 20:39:32 GMT
I wonder what companies like Amazon, Cloudera, RackSpace, Facebook, Yahoo
etc. look at for the purpose of benchmarking. I guess GridMix v3 might be of
more interest to Yahoo.

I would appreciate it if someone could comment more on this.

Thanks,
-Shrinivas

On Fri, Feb 18, 2011 at 4:50 PM, Konstantin Boudnik <cos@apache.org> wrote:

> On Fri, Feb 18, 2011 at 14:35, Ted Dunning <tdunning@maprtech.com> wrote:
> > I just read the malstone report.  They report times for a Java version that
> > is many (5x) times slower than for a streaming implementation.  That single
> > fact indicates that the Java code is so appallingly bad that this is a very
> > bad benchmark.
>
> Slow Java code? That's funny ;) Running with Hotspot on by any chance?
>
> > On Fri, Feb 18, 2011 at 2:27 PM, Jim Falgout <jim.falgout@pervasive.com> wrote:
> >
> >> We use MalStone and TeraSort. For Hive, you can use TPC-H, at least the
> >> data and the queries, if not the query generator. There is a Jira issue in
> >> Hive that discusses the TPC-H "benchmark" if you're interested. Sorry, I
> >> don't remember the issue number offhand.
> >>
> >> -----Original Message-----
> >> From: Shrinivas Joshi [mailto:jshrinivas@gmail.com]
> >> Sent: Friday, February 18, 2011 3:32 PM
> >> To: common-user@hadoop.apache.org
> >> Subject: benchmark choices
> >>
> >> Which workloads are used for serious benchmarking of Hadoop clusters? Do
> >> you care about any of the following workloads:
> >> TeraSort, GridMix v1, v2, or v3, MalStone, CloudBurst, MRBench, NNBench,
> >> sample apps shipped with the Hadoop distro like PiEstimator, dbcount, etc.
> >>
> >> Thanks,
> >> -Shrinivas
> >>
> >>
> >
>
