hadoop-common-user mailing list archives

From Dieter De Witte <drdwi...@gmail.com>
Subject Re: Performance
Date Mon, 24 Feb 2014 15:56:11 GMT

The TeraSort benchmark is probably the most common. Its mappers and
reducers do essentially 'nothing', so you only exercise the framework's
merge sort.
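
To make the 'nothing' part concrete, here is a toy sketch in plain Python,
not Hadoop code. It assumes a single reducer and no partitioner, and the
names (identity_map, identity_reduce, run_job) are made up for
illustration. The map and reduce functions pass records through untouched,
so all the ordering comes from the sort and merge steps in between, which
stand in for what the framework does.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def identity_map(key, value):
    # Hypothetical mapper: emits its input unchanged.
    yield key, value


def identity_reduce(key, values):
    # Hypothetical reducer: emits every value for the key unchanged.
    for value in values:
        yield key, value


def run_job(splits):
    # "Map side": each input split becomes one run sorted by key.
    runs = []
    for split in splits:
        mapped = [kv for k, v in split for kv in identity_map(k, v)]
        runs.append(sorted(mapped, key=itemgetter(0)))

    # "Shuffle": merge the sorted runs into one key-ordered stream.
    merged = heapq.merge(*runs, key=itemgetter(0))

    # "Reduce side": group by key and pass the values through.
    output = []
    for key, group in groupby(merged, key=itemgetter(0)):
        output.extend(identity_reduce(key, (v for _, v in group)))
    return output


if __name__ == "__main__":
    splits = [[("c", 1), ("a", 2)], [("b", 3), ("a", 4)]]
    print(run_job(splits))
    # prints [('a', 2), ('a', 4), ('b', 3), ('c', 1)]
```

Since the user functions do no work, the running time of such a job is
dominated by sorting, merging and moving the data, which is why it is
useful as a framework benchmark.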

Regards, Dieter

2014-02-24 16:42 GMT+01:00 Thomas Bentsen <th@bentzn.com>:

> Hi everyone
> I am still new to Hadoop.
> Are there any benchmarks or 'performance heuristics' for Hadoop?
> Is it possible to say something like 'You can process X lines of GZipped
> log files on a medium AWS server in Y minutes'? I would like to get an
> idea of what kind of workflow is possible.
> Thanks in advance
> Thomas Bentsen
