hadoop-general mailing list archives

From Allen Wittenauer ...@apache.org>
Subject Re: Acceptance tests
Date Mon, 16 May 2011 19:46:16 GMT

On May 16, 2011, at 11:03 AM, Evert Lammerts wrote:

> Hi all,
> What acceptance tests are people using when buying clusters for Hadoop? Any pointers
> to relevant methods?

	We get some test nodes from various manufacturers.  We do some raw IO benchmarking vs. our
other nodes.  We add them to our various grids to see how they perform in the real world, paying
attention to average task turnaround time for certain jobs.  Since we know where our current
machines are at, we can look at price per performance improvements.
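
	A rough sketch of that price per performance comparison, assuming task turnaround time is the
only metric. The function name, prices, and timings below are made up for illustration and are not
from any real grid.

```python
from statistics import mean


def price_per_performance(node_price, task_seconds, baseline_task_seconds):
    """Return the node price divided by its speedup over the baseline nodes.

    Speedup is the ratio of average baseline task time to average task time
    on the node under test, so a lower result means better value.
    """
    speedup = mean(baseline_task_seconds) / mean(task_seconds)
    return node_price / speedup


# Hypothetical task turnaround times (seconds) for the same job.
baseline_times = [120.0, 118.0, 122.0]   # current machines
candidate_times = [80.0, 82.0, 78.0]     # vendor test node

# Hypothetical per-node prices.
current_cost = price_per_performance(4000.0, baseline_times, baseline_times)
candidate_cost = price_per_performance(5000.0, candidate_times, baseline_times)

print(f"current:   {current_cost:.2f} per unit of baseline performance")
print(f"candidate: {candidate_cost:.2f} per unit of baseline performance")
```

	With these made-up numbers the candidate costs more per node but less per unit of work done.
The same comparison only means something if the timings come from your own jobs.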

	Other random things that I think are important:

		a) Unless someone shares their entire *-site.xml configuration, published benchmarks on the
net are mostly useless.  Simple things like block size have a big impact.

		b) Test your actual workload.  Synthetic benchmarks are just that--synthetic.  They may
not reflect the particular nuances of your job.

		c) Establish a baseline.  If you have no hardware today, then at least establish something
on EC2 to compare against.

		d) Make sure you talk to multiple vendors.

		e) Any advice anyone gives you on config is likely going to be wrong.