hadoop-mapreduce-user mailing list archives

From Dieter De Witte <drdwi...@gmail.com>
Subject Re: Performance
Date Mon, 24 Feb 2014 15:56:11 GMT
Hi,

The TeraSort benchmark is probably the most common one. Its mappers and
reducers do essentially 'nothing', so what you measure is mainly the
framework's own merge sort (the shuffle and sort phase).
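
To turn a TeraSort run into the kind of "X amount of data in Y minutes"
figure asked about, something like the rough Python sketch below may help.
It only builds the usual teragen / terasort / teravalidate command lines
and converts a wall-clock time into throughput; it does not run Hadoop.
The jar name, the HDFS directory, and the 600 second timing are
hypothetical placeholders, not measurements. The 100-byte row size is the
fixed record size that TeraGen writes.

```python
# Sketch: size a TeraSort run and convert a measured time into throughput.
# Nothing here talks to a cluster; the commands are only built and printed.

TERAGEN_ROW_BYTES = 100  # TeraGen writes fixed-size 100-byte rows


def rows_for_bytes(total_bytes):
    """Number of TeraGen rows needed for roughly total_bytes of input."""
    return total_bytes // TERAGEN_ROW_BYTES


def terasort_commands(examples_jar, rows, base_dir):
    """Build the generate, sort, and validate command lines as argument lists."""
    gen_dir = f"{base_dir}/input"
    sort_dir = f"{base_dir}/output"
    report_dir = f"{base_dir}/report"
    return [
        ["hadoop", "jar", examples_jar, "teragen", str(rows), gen_dir],
        ["hadoop", "jar", examples_jar, "terasort", gen_dir, sort_dir],
        ["hadoop", "jar", examples_jar, "teravalidate", sort_dir, report_dir],
    ]


def throughput_mb_per_s(rows, elapsed_seconds):
    """Sorted input volume in MB (10^6 bytes) per second of wall-clock time."""
    return rows * TERAGEN_ROW_BYTES / elapsed_seconds / 1e6


if __name__ == "__main__":
    rows = rows_for_bytes(10 * 10**9)  # about 10 GB of input
    # Jar name and directory are placeholders; adjust to your installation.
    for cmd in terasort_commands("hadoop-mapreduce-examples.jar", rows,
                                 "/benchmarks/terasort"):
        print(" ".join(cmd))
    # 600.0 is a made-up elapsed time for the terasort step, for illustration.
    print(f"{throughput_mb_per_s(rows, 600.0):.1f} MB/s")
```

Time only the terasort step on your own cluster and plug that number in;
the result says how fast the framework can sort, not how fast a job with
real map and reduce logic on GZipped logs will run.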

Regards, Dieter


2014-02-24 16:42 GMT+01:00 Thomas Bentsen <th@bentzn.com>:

> Hi everyone
>
> I am still a beginner with Hadoop.
> Are there any benchmarks or 'performance heuristics' for Hadoop?
> Is it possible to say something like 'You can process X lines of GZipped
> log file on a medium AWS server in Y minutes'? I would like to get an
> idea of what kind of workflow is possible.
>
> Thanks in advance
>
> Thomas Bentsen
>
>
