hadoop-yarn-dev mailing list archives

From hilfi alkaff <hilfialk...@gmail.com>
Subject Re: Node scheduling in 2.1.x
Date Fri, 06 Sep 2013 23:39:34 GMT
@Sandy: Ok, will do.
@Hitesh: Correct me if I'm wrong, but I think the multi resource proposal
using DRF discussed in the JIRA only handles memory and CPU.
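
For reference, here is a rough sketch of how a third resource would enter the DRF calculation. This is a toy model, not Hadoop code: the function names, the "network_mbps" key and its unit are all made up for illustration, and the unit is exactly the kind of open question a design proposal would need to settle.

```python
def dominant_share(usage, capacity):
    """Return (resource name, share) for the resource a user consumes most of."""
    # Share of each resource = what the user holds / what the cluster has.
    shares = {name: usage[name] / capacity[name] for name in capacity}
    name = max(shares, key=shares.get)
    return name, shares[name]


def next_to_serve(users, capacity):
    """DRF offers the next container to the user with the smallest dominant share."""
    return min(users, key=lambda u: dominant_share(users[u], capacity)[1])


# Hypothetical cluster totals; "network_mbps" is an assumed third resource.
capacity = {"memory_mb": 65536, "vcores": 32, "network_mbps": 10000}
users = {
    "A": {"memory_mb": 8192, "vcores": 2, "network_mbps": 5000},
    "B": {"memory_mb": 16384, "vcores": 4, "network_mbps": 100},
}
print(dominant_share(users["A"], capacity))
print(next_to_serve(users, capacity))
```

With bandwidth in the picture, user A becomes network-dominant even though its memory and CPU usage are small, so the next container would go to B.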


On Fri, Sep 6, 2013 at 6:02 PM, Sandy Ryza <sandy.ryza@cloudera.com> wrote:

> Janne,
> In 2.1.0, a ResourceRequest for node/rack/* is still required, even for
> strict locality requests.  Using AMRMClient makes this a lot easier and is
> the preferred way of submitting resource requests.  Yes, strict locality
> also works for just racks.
>
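
To make the node/rack/* point concrete, here is a small sketch of how one container request fans out into the three levels. It is a simplified model under my reading of the relax-locality flag, not the AMRMClient source: the ResourceRequest tuple and expand_request helper below are hypothetical stand-ins, and only the "*" off-switch name is taken from YARN.

```python
from typing import NamedTuple

ANY = "*"  # off-switch resource name


class ResourceRequest(NamedTuple):
    resource_name: str    # a host, a rack, or "*"
    relax_locality: bool  # may the scheduler fall through to a broader level?


def expand_request(rack, node=None, strict=False):
    """Return the node/rack/* entries for one container request."""
    requests = []
    if node is not None:
        requests.append(ResourceRequest(node, True))
        # A strict node request forbids falling back to other hosts on the rack.
        requests.append(ResourceRequest(rack, not strict))
    else:
        requests.append(ResourceRequest(rack, True))
    # A strict request at either level forbids falling back to arbitrary racks.
    requests.append(ResourceRequest(ANY, not strict))
    return requests


for entry in expand_request("/rack1", node="host1", strict=True):
    print(entry)
```

All three entries are still sent for a strict request; strictness only turns the flag off on the broader levels. A strict rack-only request keeps the rack entry relaxed and turns the flag off on "*".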
> Hilfi,
> I'm not aware of an existing JIRA for adding network bandwidth as a
> resource. Filing one would definitely be appreciated.  If you're interested
> in contributing this to Hadoop, it would be helpful to start with a
> design/proposal discussing issues such as what units to use, how it would
> be enforced, any interesting risks, etc.
>
> Thanks!
> -Sandy
>
>
> On Sat, Sep 7, 2013 at 5:10 AM, Janne Valkealahti <janne.valkealahti@gmail.com> wrote:
>
> > In terms of strict locality, how does the actual internal functionality
> > differ from how it was "done" in 2.0.X? There you needed to request
> > node/rack/* and, if you got lucky, you got the node you wanted. Do you
> > still need to request node/rack/* for a host, or is a plain host request
> > just fine?
> >
> > Will strict locality also work for allocation for just racks?
> >
> >
> > On Fri, Sep 6, 2013 at 7:52 PM, Hitesh Shah <hitesh@apache.org> wrote:
> >
> > > > Have you taken a look at https://issues.apache.org/jira/browse/YARN-326?
> > >
> > > -- Hitesh
> > >
> > > On Sep 6, 2013, at 11:02 AM, hilfi alkaff wrote:
> > >
> > > > > Thanks for all the replies. I think I have found the relevant code that
> > > > > I would like to modify. That said, a project that I'm doing now requires
> > > > > containers to have network bandwidth as one of their resources (in
> > > > > Resource.java, it currently only models memory).
> > > > >
> > > > > Since I'm planning to implement it anyway, I hope to be able to help
> > > > > Hadoop's development. However, I could not find the relevant JIRA for
> > > > > this. If you know of an existing ticket that is relevant to the
> > > > > aforementioned issue, let me know. If there is none, should I make my
> > > > > changes first (as listed at http://wiki.apache.org/hadoop/HowToContribute)
> > > > > and get back after I'm done with my code?
> > > >
> > > > Thanks in advance.
> > > >
> > > >
> > > > > On Fri, Sep 6, 2013 at 6:37 AM, Steve Loughran <stevel@hortonworks.com> wrote:
> > > > >
> > > > >> Worth adding is that this can generate a bias towards affinitive
> > > > >> assignment of an app's containers; for the YARN-896 service we've put
> > > > >> anti-affinity as a subtask, along with having the AM opt to get
> > > > >> notifications if assignments can't be met in a bounded period (or it
> > > > >> could just examine its queue of outstanding requests and reach the
> > > > >> same conclusion based on when the requests were submitted).
> > > >>
> > > >>
> > > > >> On 6 September 2013 07:43, Sandy Ryza <sandy.ryza@cloudera.com> wrote:
> > > > >>
> > > > >>> That's right.  Nodes keep checking in and, when they do, the
> > > > >>> ResourceManager looks for outstanding requests.  This means that
> > > > >>> assignment of containers to nodes depends on the order that they
> > > > >>> heartbeat in.  If container requests come in for specific nodes,
> > > > >>> locality is achieved through delay scheduling - the ResourceManager
> > > > >>> will wait for a configurable number of heartbeats before assigning a
> > > > >>> container to a non-local node.  If strict locality is turned on, the
> > > > >>> ResourceManager will wait indefinitely for a local node.
> > > >>>
> > > >>> -Sandy
> > > >>>
> > > >>>
> > > > >>> On Fri, Sep 6, 2013 at 3:33 PM, hilfi alkaff <hilfialkaff@gmail.com> wrote:
> > > > >>>
> > > > >>>> I see. What I'm wondering about is: when an application master tries
> > > > >>>> to request a container from the resource manager, which part of the
> > > > >>>> code in the resource manager actually decides which node to fetch
> > > > >>>> this container from? Is this step being done asynchronously (i.e.,
> > > > >>>> nodes keep checking if there are requests from the ResourceManager
> > > > >>>> during the node update event)?
> > > >>>>
> > > >>>>
> > > > >>>> On Fri, Sep 6, 2013 at 1:22 AM, Sandy Ryza <sandy.ryza@cloudera.com> wrote:
> > > > >>>>
> > > > >>>>> Hi Hilfi,
> > > > >>>>>
> > > > >>>>> Nodes are constantly heartbeating to the ResourceManager.  A node
> > > > >>>>> update event is triggered each time this happens.
> > > >>>>>
> > > >>>>> -Sandy
> > > >>>>>
> > > >>>>>
> > > > >>>>> On Fri, Sep 6, 2013 at 3:20 PM, hilfi alkaff <hilfialkaff@gmail.com> wrote:
> > > > >>>>>
> > > > >>>>>> Hi,
> > > > >>>>>>
> > > > >>>>>> I'm trying to trace the code flow of the scheduling done in YARN. I
> > > > >>>>>> would like to know where the code is that decides which node to
> > > > >>>>>> schedule the jobs on.
> > > > >>>>>>
> > > > >>>>>> I found the handle() function in the resource manager's scheduler
> > > > >>>>>> (e.g., CapacityScheduler.java) that handles the node update event,
> > > > >>>>>> which then executes the assignment of containers for that particular
> > > > >>>>>> node, but I do not understand how that node even gets chosen.
> > > > >>>>>>
> > > > >>>>>> If anybody could tell me about a file, function or module name that
> > > > >>>>>> does this, that would be extremely helpful.
> > > >>>>>>
> > > >>>>>> --
> > > >>>>>> ~Hilfi Alkaff~
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>> --
> > > >>>> ~Hilfi Alkaff~
> > > >>>>
> > > >>>
> > > >>
> > > >> --
> > > >> CONFIDENTIALITY NOTICE
> > > >> NOTICE: This message is intended for the use of the individual or
> > > entity to
> > > >> which it is addressed and may contain information that is
> > confidential,
> > > >> privileged and exempt from disclosure under applicable law. If the
> > > reader
> > > >> of this message is not the intended recipient, you are hereby
> notified
> > > that
> > > >> any printing, copying, dissemination, distribution, disclosure or
> > > >> forwarding of this communication is strictly prohibited. If you have
> > > >> received this communication in error, please contact the sender
> > > immediately
> > > >> and delete it from your system. Thank You.
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > ~Hilfi Alkaff~
> > >
> > >
> >
>



-- 
~Hilfi Alkaff~
