hadoop-common-user mailing list archives

From "Bryan A. P. Pendleton" ...@geekdom.net>
Subject Re: Loadbalancing
Date Wed, 10 Jan 2007 19:57:08 GMT
Well, there are several kinds of "loadbalancing" that might apply. Right now,
you have to set how many tasks you want run at once on each machine - if your
machines have widely differing numbers of CPUs (my cluster has everything
from 1-CPU to 4-CPU machines), this setting can matter.

However, the way map/reduce works, within the number of task instances you
allow to run *simultaneously*, you get a certain amount of automatic load
balancing, provided your jobs are large and have many more tasks than
machines. In that case, fast machines complete more tasks than slow ones,
which effectively balances load across the cluster. In my cluster, some
machines end up completing 2x-3x as many tasks as other, slower machines,
while all of them stay busy for most of the job's run.
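
To make the effect concrete, here is a minimal simulation sketch. It is not
Hadoop code: the function name simulate, the relative machine speeds, and the
assumption that every task costs the same amount of work are all made up for
illustration. It only models the idea that a machine picks up the next
pending task as soon as it is free, with one task at a time per machine.

```python
import heapq


def simulate(speeds, num_tasks, task_work=1.0):
    """Hand out equal-sized tasks to whichever machine becomes free first.

    speeds    -- hypothetical relative speed of each machine (work per time unit)
    num_tasks -- total number of tasks in the job
    Returns (tasks completed per machine, finish time of each machine).
    """
    # Heap of (time at which the machine is next free, machine index).
    free_at = [(0.0, i) for i in range(len(speeds))]
    heapq.heapify(free_at)
    completed = [0] * len(speeds)
    finish = [0.0] * len(speeds)

    for _ in range(num_tasks):
        # The machine that frees up earliest takes the next pending task.
        now, i = heapq.heappop(free_at)
        done = now + task_work / speeds[i]
        completed[i] += 1
        finish[i] = done
        heapq.heappush(free_at, (done, i))

    return completed, finish


if __name__ == "__main__":
    # Assumed cluster: two slow machines, one 2x machine, one 3x machine.
    completed, finish = simulate([1.0, 1.0, 2.0, 3.0], num_tasks=700)
    print("tasks completed per machine:", completed)
    print("finish time per machine:", [round(t, 2) for t in finish])
```

With many more tasks than machines, the per-machine task counts come out
roughly proportional to machine speed, and all machines finish within about
one task's duration of each other. With only a handful of tasks the balancing
is much coarser, which is why the job needs to be large for this to work.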

On 1/10/07, Sarvesh Singh <serveshp@gmail.com> wrote:
> Can somebody reply to my earlier email (below) also ?
> How does it do loadbalancing by default? Since I will be having different
> configuration machines,
> I want to see the best CPU usage of each machines.
> Thanks in advance!
> Servesh
> On 1/8/07, Sarvesh Singh <serveshp@gmail.com> wrote:
> >
> > Hi,
> >
> > I want loadbalancing to happen when map/reduce tasks get distributed over
> > hadoop cluster.
> > I have different configuration machines and want to utilize the cpu well.
> > How can I do that?
> >
> > Thanks
> > Servesh
> >
> >

Bryan A. P. Pendleton
Ph: (877) geek-1-bp
