hadoop-yarn-dev mailing list archives

From Sandy Ryza <sandy.r...@cloudera.com>
Subject Re: Node scheduling in 2.1.x
Date Fri, 06 Sep 2013 23:02:55 GMT
Janne,
In 2.1.0, a ResourceRequest for node/rack/* is still required, even for
strict locality requests.  Using AMRMClient makes this a lot easier and is
the preferred way of submitting resource requests.  Yes, strict locality
also works for just racks.
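
As a rough illustration of what "still required" means, the sketch below models the set of entries that a strict request expands to. Entry, strictNode and strictRack are hypothetical names made up for this example, not YARN classes. The relaxLocality values assume the usual reading of the ResourceRequest flag: relaxation is switched off at the rack and * levels for a strict node request, and at the * level for a strict rack request. AMRMClient is meant to spare you from building this set by hand.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only: Entry is a hypothetical stand-in for YARN's
// ResourceRequest, reduced to the three fields that matter for locality.
final class StrictLocalitySketch {
    static final String ANY = "*";

    static final class Entry {
        final String resourceName;   // host name, rack name, or "*"
        final int numContainers;
        final boolean relaxLocality; // can locality be relaxed to this level?

        Entry(String resourceName, int numContainers, boolean relaxLocality) {
            this.resourceName = resourceName;
            this.numContainers = numContainers;
            this.relaxLocality = relaxLocality;
        }

        @Override
        public String toString() {
            return resourceName + " x" + numContainers
                    + " relaxLocality=" + relaxLocality;
        }
    }

    // Strict node locality: node, rack and "*" entries are all present,
    // with relaxation switched off at the rack and "*" levels.
    static List<Entry> strictNode(String node, String rack, int containers) {
        List<Entry> entries = new ArrayList<Entry>();
        entries.add(new Entry(node, containers, true));
        entries.add(new Entry(rack, containers, false));
        entries.add(new Entry(ANY, containers, false));
        return entries;
    }

    // Strict rack locality: no node entry, relaxation switched off at "*".
    static List<Entry> strictRack(String rack, int containers) {
        List<Entry> entries = new ArrayList<Entry>();
        entries.add(new Entry(rack, containers, true));
        entries.add(new Entry(ANY, containers, false));
        return entries;
    }

    public static void main(String[] args) {
        for (Entry e : strictNode("host1", "/rack1", 1)) {
            System.out.println(e);
        }
        for (Entry e : strictRack("/rack1", 1)) {
            System.out.println(e);
        }
    }
}
```

The point of the model is that the rack and * entries do not disappear for a strict request; they are still sent, just with relaxation disabled.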

Hilfi,
I'm not aware of an existing JIRA for adding network bandwidth as a
resource. Filing one would definitely be appreciated.  If you're interested
in contributing this to Hadoop, it would be helpful to start with a
design/proposal discussing issues such as what units to use, how it would
be enforced, any interesting risks, etc.

Thanks!
-Sandy


On Sat, Sep 7, 2013 at 5:10 AM, Janne Valkealahti <
janne.valkealahti@gmail.com> wrote:

> In terms of strict locality, how does the actual in-house functionality
> differ from how it was "done" in 2.0.X? There you needed to request
> node/rack/*, and if you got lucky you got the node you wanted. Do you still
> need to request node/rack/* for a host, or is a plain host request enough?
>
> Will strict locality also work for allocation for just racks?
>
>
> On Fri, Sep 6, 2013 at 7:52 PM, Hitesh Shah <hitesh@apache.org> wrote:
>
> > Have you taken a look at https://issues.apache.org/jira/browse/YARN-326?
> >
> > -- Hitesh
> >
> > On Sep 6, 2013, at 11:02 AM, hilfi alkaff wrote:
> >
> > > Thanks for all the replies. I think I have found the relevant code that I
> > > would like to modify. That said, a project that I'm doing now requires
> > > containers to have network bandwidth as one of their resources (in
> > > Resource.java, it currently only models memory).
> > >
> > > Since I'm planning to implement it anyway, I hope to be able to help
> > > Hadoop's development. However, I could not find the relevant JIRA for this.
> > > If you know of an existing ticket that is relevant to the aforementioned
> > > issue, let me know. If there is none, should I make my changes first (as
> > > listed at http://wiki.apache.org/hadoop/HowToContribute) and get back after
> > > I'm done with my code?
> > >
> > > Thanks in advance.
> > >
> > >
> > > On Fri, Sep 6, 2013 at 6:37 AM, Steve Loughran <stevel@hortonworks.com>
> > > wrote:
> > >
> > >> Worth adding is that this can generate a bias towards affinitive
> > >> assignment of an app's containers; for the YARN-896 service we've put
> > >> anti-affinity as a subtask, along with having the AM opt to get
> > >> notifications if assignments can't be met in a bounded period (or it could
> > >> just examine its queue of outstanding requests and reach the same
> > >> conclusion based on when the requests were submitted).
> > >>
> > >>
> > >> On 6 September 2013 07:43, Sandy Ryza <sandy.ryza@cloudera.com> wrote:
> > >>
> > >>> That's right.  Nodes keep checking in and, when they do, the
> > >>> ResourceManager looks for outstanding requests.  This means that
> > >>> assignment of containers to nodes depends on the order that they
> > >>> heartbeat in.  If container requests come in for specific nodes locality
> > >>> is achieved through delay scheduling - the ResourceManager will wait for
> > >>> a configurable number of heartbeats before assigning a container to a
> > >>> non-local node.  If strict locality is turned on, the ResourceManager
> > >>> will wait indefinitely for a local node.
> > >>>
> > >>> -Sandy
> > >>>
> > >>>
> > >>> On Fri, Sep 6, 2013 at 3:33 PM, hilfi alkaff <hilfialkaff@gmail.com>
> > >>> wrote:
> > >>>
> > >>>> I see. What I'm wondering about is: when an application master tries
> > >>>> to request a container from the resource manager, which part of the
> > >>>> code in the resource manager actually decides which node to fetch this
> > >>>> container from? Is this step being done asynchronously (i.e., do nodes
> > >>>> keep checking if there are requests from the ResourceManager during
> > >>>> the node update event)?
> > >>>>
> > >>>>
> > >>>> On Fri, Sep 6, 2013 at 1:22 AM, Sandy Ryza <sandy.ryza@cloudera.com>
> > >>>> wrote:
> > >>>>
> > >>>>> Hi Hilfi,
> > >>>>>
> > >>>>> Nodes are constantly heartbeating to the ResourceManager.  A node
> > >>>>> update event is triggered each time this happens.
> > >>>>>
> > >>>>> -Sandy
> > >>>>>
> > >>>>>
> > >>>>> On Fri, Sep 6, 2013 at 3:20 PM, hilfi alkaff <hilfialkaff@gmail.com>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> Hi,
> > >>>>>>
> > >>>>>> I'm trying to trace the code flow of the scheduling done in YARN. I
> > >>>>>> would like to know where the code is that decides which node to
> > >>>>>> schedule the jobs on.
> > >>>>>>
> > >>>>>> I found the handle() function in the resource manager's scheduler
> > >>>>>> (e.g. CapacityScheduler.java) that handles the node update event,
> > >>>>>> which then executes the assignment of containers for that particular
> > >>>>>> node, but I do not understand how that node even gets chosen.
> > >>>>>>
> > >>>>>> If anybody could tell me about a file, function or module name that
> > >>>>>> does this, that would be extremely helpful.
> > >>>>>>
> > >>>>>> --
> > >>>>>> ~Hilfi Alkaff~
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>> ~Hilfi Alkaff~
> > >>>>
> > >>>
> > >>
> > >>
> > >
> > >
> > >
> > > --
> > > ~Hilfi Alkaff~
> >
> >
>
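
The heartbeat-driven delay scheduling that Sandy describes in the quoted thread above can be sketched as a small simulation. This is a toy model, not the CapacityScheduler or FairScheduler code: DelaySchedulingSketch, assign and maxMissed are hypothetical names, and the model assumes a single outstanding request for one preferred node and counts every non-local heartbeat as one missed opportunity.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of delay scheduling; all names here are made up for illustration.
final class DelaySchedulingSketch {

    // Walks the heartbeats in arrival order and returns the node that gets
    // the container, or null if no heartbeat leads to an assignment.
    static String assign(List<String> heartbeats, String preferredNode,
                         int maxMissed, boolean strictLocality) {
        int missed = 0;
        for (String node : heartbeats) {
            if (node.equals(preferredNode)) {
                return node;            // local node checked in: assign there
            }
            if (strictLocality) {
                continue;               // strict: keep waiting for the local node
            }
            missed++;
            if (missed > maxMissed) {
                return node;            // waited long enough: accept a non-local node
            }
        }
        return null;                    // still waiting when the heartbeats ran out
    }

    public static void main(String[] args) {
        List<String> heartbeats = Arrays.asList("n2", "n3", "n4", "n1");
        // Gives up on n1 after more than two non-local heartbeats.
        System.out.println(assign(heartbeats, "n1", 2, false));
        // A larger delay lets the request wait until n1 checks in.
        System.out.println(assign(heartbeats, "n1", 3, false));
        // Strict locality ignores the delay and only ever accepts n1.
        System.out.println(assign(heartbeats, "n1", 0, true));
    }
}
```

Because the loop only acts when a node checks in, the outcome depends on the order of the heartbeats, which is the behaviour described in the thread.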
