hadoop-common-user mailing list archives

From Miles Osborne <mi...@inf.ed.ac.uk>
Subject Re: hadoop performance with very small cluster
Date Thu, 21 May 2009 07:30:09 GMT
if you mean "Hadoop does not give a speed-up compared with a
sequential version" then this is because of the overhead associated with
running the framework: your job needs to be scheduled, JVMs
instantiated, data copied, data sorted, and so on.

if your jobs can be parallelised and you have enough machines (your
cluster is large enough) then the ability to use more machines should
compensate for the framework overhead.
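
to make the trade-off concrete, here is a rough back-of-the-envelope
sketch. it assumes a deliberately simplified model that is not taken from
Hadoop itself: the framework overhead is a fixed cost per job, and the
remaining work divides perfectly across machines. the function names and
the numbers are made up for illustration; real jobs will also pay for
skew, shuffle traffic and stragglers.

```python
def hadoop_time(work_seconds, machines, overhead_seconds):
    """Estimated job time under the simplified model (an assumption):
    a fixed framework overhead plus perfectly divided work."""
    return overhead_seconds + work_seconds / machines


def break_even_machines(work_seconds, overhead_seconds):
    """Smallest machine count at which the modelled cluster run beats
    the sequential run, or None if the overhead alone is already at
    least as long as the sequential job."""
    if overhead_seconds >= work_seconds:
        return None  # no number of machines can win in this model
    machines = 2
    while hadoop_time(work_seconds, machines, overhead_seconds) >= work_seconds:
        machines += 1
    return machines


if __name__ == "__main__":
    overhead = 60  # hypothetical: one minute of scheduling, JVM start-up, copying, sorting
    for work in (30, 60, 100, 600):  # hypothetical sequential run times in seconds
        print(work, hadoop_time(work, 6, overhead), break_even_machines(work, overhead))
```

under these assumed numbers, a 6-machine cluster loses on a job that takes
one minute sequentially (the overhead dominates) but wins comfortably on a
job that takes ten minutes.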

even if your sequential / hacked version beats the Hadoop version on a
small cluster, in my mind a major advantage of Hadoop (and this is
something that people tend to forget) is that your Hadoop version will
almost certainly be simpler and easier to maintain.


2009/5/21 zhu hui <chinazhuhui04@gmail.com>:
> hello, everybody.
> i am new to hadoop, and i have heard from others that hadoop does not
> perform efficiently when the cluster is very small, for example 6 machines.
> but i cannot find the reasons, or any materials that i could use as
> evidence.
> i would be very grateful if anybody could share some materials or ideas.
> Best Wishes.
> Eric.Syu
> --
> Nothing Impossible

