hama-dev mailing list archives

From Micle Bu <micle...@gmail.com>
Subject Re: Data locality in Hama
Date Wed, 17 Apr 2013 15:46:31 GMT
Thanks Suraj!

Another approach is to select the groom server that holds the most blocks/parts
of the split (when splitSize > HDFS block size, a split may span several HDFS
blocks) to obtain maximum locality.
Taking into account whether the hostname is in a different rack is another good hint.

Micle Bu
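
The "most blocks" idea above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not Hama code: the class
MostLocalGroomSketch, the method pickMostLocalHost, and the freeSlots map are
hypothetical names, and the per-block host lists are assumed to have been
obtained from HDFS elsewhere.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not part of Hama.
class MostLocalGroomSketch {

  /**
   * blockHosts[i] lists the hosts that store block i of one split.
   * freeSlots maps a groom host name to its number of free task slots
   * (an assumed input; hosts missing from the map count as having none).
   * Returns the host that stores the most blocks of the split and still
   * has a free slot, or null if no such host exists.
   */
  static String pickMostLocalHost(String[][] blockHosts,
                                  Map<String, Integer> freeSlots) {
    // Count how many blocks of the split each host stores.
    Map<String, Integer> blocksPerHost = new HashMap<>();
    for (String[] hosts : blockHosts) {
      for (String host : hosts) {
        blocksPerHost.merge(host, 1, Integer::sum);
      }
    }

    // Keep the host with the highest block count among those with a free slot.
    String best = null;
    int bestCount = 0;
    for (Map.Entry<String, Integer> e : blocksPerHost.entrySet()) {
      int free = freeSlots.getOrDefault(e.getKey(), 0);
      if (free > 0 && e.getValue() > bestCount) {
        best = e.getKey();
        bestCount = e.getValue();
      }
    }
    return best;
  }
}
```

Ties between hosts with the same block count are broken arbitrarily here; a
rack-aware variant could use rack membership as the tie-breaker.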


On Wed, Apr 17, 2013 at 11:29 PM, Suraj Menon <surajsmenon@apache.org> wrote:

> Good catch! Yes, the logic is to find the first groom server that has the
> split and has available slots for execution.
> You might note that depending on the HDFS allocation, this hostname might
> not be in the same rack. You are welcome to fix this.
>
>
> On Wed, Apr 17, 2013 at 11:15 AM, Micle Bu <micle.bu@gmail.com> wrote:
>
> > Hi all,
> >
> > I'm learning about data locality in Hama, and found there is a
> > BestEffortDataLocalTaskAllocator class for this purpose. It's a good idea
> > to assign a task to the groom which contains its split;
> > getGroomToSchedule() plays this role.
> >
> > In getGroomToSchedule(), the code looks like:
> >
> > GroomServerStatus groom = grooms.get(location);
> > ...
> > if (taskInGroom < groom.getMaxTasks() &&
> > location.equals(groom.getGroomHostName())) {
> >         return groom.getGroomHostName();
> > }
> >
> > It seems that location.equals(groom.getGroomHostName()) is always true, so
> > it just selects the first groom which contains the split? Am I right?
> >
> > Thanks in advance!
> >
> > Micle Bu
> >
>

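For reference, the first-fit behaviour discussed in this thread can be sketched
as below. This is a simplified stand-in, not the actual
BestEffortDataLocalTaskAllocator source: the Groom class and its runningTasks
field are hypothetical, and the sketch assumes the grooms map is keyed by groom
host name, which is the situation in which the host name comparison from the
quoted snippet adds nothing.

```java
import java.util.Map;

// Hypothetical sketch, not the Hama implementation.
class FirstFitGroomSketch {

  /** Minimal stand-in for a groom status record (assumed fields). */
  static final class Groom {
    final String hostName;
    final int maxTasks;
    final int runningTasks;

    Groom(String hostName, int maxTasks, int runningTasks) {
      this.hostName = hostName;
      this.maxTasks = maxTasks;
      this.runningTasks = runningTasks;
    }
  }

  /**
   * Returns the first split location whose groom has a free slot, or null.
   * Because grooms is assumed to be keyed by host name, a successful lookup
   * already implies location.equals(groom.hostName), so no extra check is made.
   */
  static String firstGroomWithFreeSlot(String[] splitLocations,
                                       Map<String, Groom> grooms) {
    for (String location : splitLocations) {
      Groom groom = grooms.get(location);
      if (groom == null) {
        continue; // no groom server runs on this host
      }
      if (groom.runningTasks < groom.maxTasks) {
        return groom.hostName;
      }
    }
    return null;
  }
}
```

The result depends only on the order of the split locations, so a later host
with more local blocks or a closer rack is never preferred over an earlier one.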