apex-dev mailing list archives

From Sanjay Pujare <san...@datatorrent.com>
Subject Re: "ExcludeNodes" for an Apex application
Date Wed, 30 Nov 2016 18:58:53 GMT
To me, both use cases appear to be generic resource management use cases. For example, a randomly
rebooting node is not good for any purpose, especially long-running apps, so it is a bit of a stretch
to imagine that such a node will be acceptable for some batch jobs in Yarn. Such a node
should be marked “Bad” or Unavailable in Yarn itself.

The second use case is also a typical anti-affinity use case, which ideally should be implemented
in Yarn; Milind’s example can also apply to non-Apex batch jobs. In any case, it looks
like Yarn still doesn’t have it (https://issues.apache.org/jira/browse/YARN-1042), so if
Apex needs it we will need to do it ourselves.
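
As a rough illustration of what "doing it ourselves" might look like for the first
option Milind describes below (Stram keeps an exclude list, rejects containers that
land on those nodes, and asks again), here is a minimal sketch. Everything in it is
hypothetical: the class, the Allocation type, and the filter method are made up for
illustration and are not Apex or YARN APIs. In a real application master, the rejected
containers would be released back to YARN and re-requested.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch only: none of these names come from Apex or YARN.
class ExcludeNodesSketch {

    /** Minimal stand-in for an allocated container: an id and the host it landed on. */
    static final class Allocation {
        final String containerId;
        final String host;

        Allocation(String containerId, String host) {
            this.containerId = containerId;
            this.host = host;
        }
    }

    /** Outcome of filtering one round of allocations. */
    static final class Result {
        final List<Allocation> accepted = new ArrayList<>();
        // Rejected allocations would be released and requested again.
        final List<Allocation> rejected = new ArrayList<>();
    }

    /** Splits allocations into those on allowed hosts and those on excluded hosts. */
    static Result filter(List<Allocation> allocated, Set<String> excludedHosts) {
        Result result = new Result();
        for (Allocation a : allocated) {
            if (excludedHosts.contains(a.host)) {
                result.rejected.add(a);
            } else {
                result.accepted.add(a);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed host names, for illustration only.
        Set<String> excluded = new HashSet<>(Arrays.asList("node3"));
        List<Allocation> round = Arrays.asList(
                new Allocation("c1", "node1"),
                new Allocation("c2", "node3"));

        Result r = filter(round, excluded);
        System.out.println("accepted=" + r.accepted.size()
                + " rejected=" + r.rejected.size());
    }
}
```

One caveat with this approach is that nothing stops the resource manager from handing
back the same excluded node on the next request, so a real implementation would likely
need a retry limit or some way to tell Yarn about the exclusion.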

On 11/30/16, 10:39 AM, "Munagala Ramanath" <ram@datatorrent.com> wrote:

    But then, what's the solution to the 2 problem scenarios that Milind
    describes?
    
    Ram
    
    On Wed, Nov 30, 2016 at 10:34 AM, Sanjay Pujare <sanjay@datatorrent.com>
    wrote:
    
    > I think “exclude nodes” and such is really the job of the resource manager
    > i.e. Yarn. So I am not sure taking over some of these tasks in Apex would
    > be very useful.
    >
    > I agree with Amol that apps should be node neutral. Resource management in
    > Yarn together with fault tolerance in Apex should minimize the need for
    > this feature although I am sure one can find use cases.
    >
    >
    > On 11/29/16, 10:41 PM, "Amol Kekre" <amol@datatorrent.com> wrote:
    >
    >     We do have this feature in Yarn, but that applies to all applications.
    >     I am not sure if Yarn has anti-affinity. This feature may be used, but in
    >     general there is danger in an application taking over resource
    >     allocation. Another quirk is that big data apps should ideally be
    >     node-neutral. This is a good idea, if we are able to carve out something
    >     where the need is app specific.
    >
    >     Thks
    >     Amol
    >
    >
    >     On Tue, Nov 29, 2016 at 10:00 PM, Milind Barve <milindb@gmail.com> wrote:
    >
    >     > We have seen the 2 cases mentioned below, where it would have been nice
    >     > if Apex allowed us to exclude a node from the cluster for an
    >     > application.
    >     >
    >     > 1. A node in the cluster had gone bad (was randomly rebooting) and so
    >     > an Apex app should not use it - other apps can use it as they were
    >     > batch jobs.
    >     > 2. A node is being used for a mission critical app (could be an Apex
    >     > app itself), but another Apex app which is mission critical should
    >     > not be using resources on that node.
    >     >
    >     > Can we have a way in which Stram and YARN can coordinate with each
    >     > other to not use a set of nodes for the application? It can be done
    >     > in 2 ways -
    >     >
    >     > 1. Have a list of "exclude" nodes with Stram - when YARN allocates
    >     > resources on any of these, STRAM rejects them and gets resources
    >     > allocated again from YARN.
    >     > 2. Have a list of nodes that can be used for an app - this can be a
    >     > part of config. However, I don't think this would be the right way to
    >     > do so, as we will need support from YARN as well. Further, this might
    >     > be difficult to change at runtime if need be.
    >     >
    >     > Any thoughts?
    >     >
    >     >
    >     > --
    >     > ~Milind bee at gee mail dot com
    >     >
    >
    >
    >
    >
    


