hadoop-mapreduce-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: Assigning reduce tasks to specific nodes
Date Wed, 28 Nov 2012 15:46:48 GMT
Mappers? Uhm... yes you can do it.
Yes, it is non-trivial.
Yes, it is not recommended. 

I think we talk a bit about this in an InfoQ article written by Boris Lublinsky. 

It's kind of wild when your entire cluster map goes red in Ganglia... :-)


On Nov 28, 2012, at 2:41 AM, Harsh J <harsh@cloudera.com> wrote:

> Hi,
> 
> Mapper scheduling is indeed influenced by the locations that the InputSplit's
> getLocations() returns.
> 
> The map task itself does not care about deserializing the location information, as it
> is of no use to it. The location information is vital to the scheduler (or in 0.20.2, the
> JobTracker), to which it is sent directly when a job is submitted. The locations are used
> pretty well here.
> 
> You should be able to control (or rather, influence) mapper placement by working with
> the InputSplits, but not strictly so, because in the end it's up to your MR scheduler to
> make data-local or non-data-local assignments.
> 
> 
> On Wed, Nov 28, 2012 at 11:39 AM, Hiroyuki Yamada <mogwaing@gmail.com> wrote:
> Hi Harsh,
> 
> Thank you for the information.
> I understand the current circumstances.
> 
> How about for mappers?
> As far as I tested, location information in InputSplit is ignored in 0.20.2,
> so there seems to be no easy way to assign mappers to specific nodes.
> (I checked the source earlier and noticed that
> location information is not restored when deserializing the InputSplit
> instance.)
> 
> Thanks,
> Hiroyuki
> 
> On Wed, Nov 28, 2012 at 2:08 PM, Harsh J <harsh@cloudera.com> wrote:
> > This is not supported/available currently even in MR2, but take a look at
> > https://issues.apache.org/jira/browse/MAPREDUCE-199.
> >
> >
> > On Wed, Nov 28, 2012 at 9:34 AM, Hiroyuki Yamada <mogwaing@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> I am wondering how I can assign reduce tasks to specific nodes.
> >> What I want to do is, for example, assign the reducer that produces
> >> part-00000 to node xxx000,
> >> part-00001 to node xxx001, and so on.
> >>
> >> I think it's about task assignment scheduling, but
> >> I am not sure where to customize to achieve this.
> >> Is this done by writing some extensions,
> >> or is there an easier way to do this?
> >>
> >> Regards,
> >> Hiroyuki
> >
> >
> >
> >
> > --
> > Harsh J
> 
> 
> 
> -- 
> Harsh J
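
Harsh's point above, that split locations influence mapper placement without strictly controlling it, can be illustrated with a toy model. This is only a sketch under stated assumptions: it does not use the Hadoop API, the class LocalityHintDemo, the Split type, and pickSplitFor() are hypothetical names made up for illustration, and the assignment rule is a simplification rather than what the JobTracker or any real scheduler does. The host names xxx000 and xxx001 are borrowed from the original question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class LocalityHintDemo {

    /** Hypothetical stand-in for a split; only the location hint matters here. */
    static final class Split {
        final String name;
        final List<String> preferredHosts; // plays the role of getLocations()

        Split(String name, String... preferredHosts) {
            this.name = name;
            this.preferredHosts = Arrays.asList(preferredHosts);
        }
    }

    /**
     * Toy assignment rule (an assumption, not Hadoop's real scheduler):
     * a host with a free slot gets a pending split that names it as a
     * preferred location if one exists, otherwise the oldest pending split.
     * Returns null when nothing is pending.
     */
    static Split pickSplitFor(String host, List<Split> pending) {
        for (Iterator<Split> it = pending.iterator(); it.hasNext(); ) {
            Split s = it.next();
            if (s.preferredHosts.contains(host)) {
                it.remove();
                return s;
            }
        }
        return pending.isEmpty() ? null : pending.remove(0);
    }

    public static void main(String[] args) {
        List<Split> pending = new ArrayList<>();
        pending.add(new Split("split-0", "xxx000"));
        pending.add(new Split("split-1", "xxx001"));

        // xxx001 is named by split-1, so the hint is honored.
        System.out.println("xxx001 runs " + pickSplitFor("xxx001", pending).name);
        // xxx999 is named by no split, so it still gets work (non-data-local).
        System.out.println("xxx999 runs " + pickSplitFor("xxx999", pending).name);
    }
}
```

In this model the hint wins only when the preferred host happens to ask for work while the split is still pending; any other free host can take the split first, which is why the placement is an influence and not a guarantee.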

