hadoop-mapreduce-user mailing list archives

From Brian Stempin <bstem...@rightaction.com>
Subject Re: Performance
Date Tue, 25 Feb 2014 20:09:34 GMT
Part of the problem is the word, "process."  That could be really
complicated or really easy.  It could also be done in Java or some other
language via the streaming JAR.
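
To give a rough idea of the streaming route, here is a minimal sketch of a mapper in Python. It assumes whitespace-separated log lines and counts them by their first field; the names map_line and run are made up for illustration, and real "processing" may look nothing like this.

```python
import sys


def map_line(line):
    """Turn one log line into a (key, 1) pair, or None for a blank line.

    Assumption: the key is the first whitespace-separated field.
    """
    fields = line.split()
    if not fields:
        return None
    return fields[0], 1


def run(stdin=sys.stdin, stdout=sys.stdout):
    # Streaming feeds input records to the mapper on stdin, one per line,
    # and reads tab-separated key/value lines back from stdout.
    for line in stdin:
        pair = map_line(line)
        if pair is not None:
            stdout.write("%s\t%d\n" % pair)


if __name__ == "__main__":
    run()
```

A script like this is handed to the streaming JAR as the mapper. How fast it runs depends almost entirely on what map_line has to do, which is why a general number is so hard to give.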

It's hard for anyone to say without more details.  Even with more details,
it's still pretty hard to say.

Brian


On Mon, Feb 24, 2014 at 1:22 PM, Thomas Bentsen <th@bentzn.com> wrote:

> Thanks Dieter!
> I'll look into it.
>
> Still... It would be nice to hear something from the real world. Would
> any of you working with Hadoop in a prod env be willing to share
> something?
>
> /th
>
>
>
>
> On Mon, 2014-02-24 at 16:56 +0100, Dieter De Witte wrote:
> > Hi,
> >
> > The terasort benchmark is probably the most common. Its mappers and
> > reducers do 'nothing', so you only exercise the framework's
> > mergesort functionality.
> >
> >
> > Regards, Dieter
> >
> >
> >
> > 2014-02-24 16:42 GMT+01:00 Thomas Bentsen <th@bentzn.com>:
> >         Hi everyone
> >
> >         I am still beginning Hadoop.
> >         Are there any benchmarks or 'performance heuristics' for
> >         Hadoop?
> >         Is it possible to say something like 'You can process X lines
> >         of GZipped log file on a medium AWS server in Y minutes'? I would
> >         like to get an idea of what kind of workflow is possible.
> >
> >         Thanks in advance
> >
> >         Thomas Bentsen
> >
> >
> >
>
>
>
