heron-dev mailing list archives

From Karthik Ramasamy <kart...@streaml.io>
Subject Re: [Discuss] HealthManager launch switch on container-0
Date Fri, 04 Aug 2017 18:03:46 GMT
That is better. If the setting is in healthmgr.yaml, you could use the cli
to turn it on for desired topologies, and
that gives independent control at a per-topology level.

cheers
/karthik
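
A minimal sketch of the layering discussed here: a cluster-wide default from
a yaml file, overridden per topology by key=value properties passed on the
command line. This is illustrative only and is not how the Heron cli parses
--config-property. The function names are hypothetical, the key is borrowed
from the example further down the thread, and the "disabled" default value
is an assumption.

```python
# Hypothetical helpers; not part of Heron.
HM_KEY = "heron.config.topology.healthmanager.mode"  # key taken from the thread's example


def parse_config_properties(props):
    """Turn ["key=value", ...] into a dict, splitting on the first '=' only."""
    overrides = {}
    for prop in props:
        key, sep, value = prop.partition("=")
        if not sep or not key.strip():
            raise ValueError("expected key=value, got %r" % (prop,))
        overrides[key.strip()] = value.strip()
    return overrides


def effective_config(yaml_defaults, cli_props):
    """Start from the cluster-wide defaults, then apply per-topology overrides."""
    merged = dict(yaml_defaults)
    merged.update(parse_config_properties(cli_props))
    return merged


cluster_defaults = {HM_KEY: "disabled"}  # assumed cluster-wide default
opted_in = effective_config(cluster_defaults, [HM_KEY + "=enable"])
untouched = effective_config(cluster_defaults, [])
```

With this shape, a topology submitted without the property keeps the
cluster-wide value, and only topologies that pass it are affected.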

On Fri, Aug 4, 2017 at 11:00 AM, Sanjeev Kulkarni <sanjeevrk@gmail.com>
wrote:

> Having a setting that enables it cluster-wide is indeed one of the desired
> properties, as mentioned in Ashvin's first email. The setting in
> healthmgr.yaml would control that. It would be set to false by default.
> Users interested in trying it out could change that and submit it.
>
>
> On Fri, Aug 4, 2017 at 10:55 AM, Karthik Ramasamy <karthik@streaml.io>
> wrote:
>
> > If we enable it in healthmgr.yaml, it becomes cluster-wide, which is not
> > the desired option. For the cli, there are no changes in the code. The
> > config property function is already built in. All you have to do is
> > determine what the key and its value should be.
> >
> > cheers
> > /karthik
> >
> > On Fri, Aug 4, 2017 at 10:51 AM, Sanjeev Kulkarni <sanjeevrk@gmail.com>
> > wrote:
> >
> > > Enabling the health manager doesn't sound like an API. Thus I agree that
> > > Config is not the right place for a setting like this.
> > > I also don't like overloading the cli with this. IMO the cli is already
> > > overloaded with a bunch of things that it shouldn't be.
> > > Why can't we make this part of healthmgr.yaml itself? Or maybe
> > > heron_internals.yaml?
> > >
> > > On Fri, Aug 4, 2017 at 10:12 AM, Karthik Ramasamy <karthik@streaml.io>
> > > wrote:
> > >
> > > > Ashvin -
> > > >
> > > > Instead of adding a Config API to enable self-healing per topology, an
> > > > interested user can enable the config using --config-property during
> > > > heron submit. For example,
> > > >
> > > > heron submit <cluster-name> --config-property
> > > > "heron.config.topology.healthmanager.mode=enable" <topology-file>
> > > > <topology-class> <topology-name>
> > > >
> > > > The advantage of this approach is that there is no hard-coded config
> > > > in the code that will require later removal. Thoughts?
> > > >
> > > > cheers
> > > > /karthik
> > > >
> > > >
> > > > On Fri, Aug 4, 2017 at 8:57 AM, Ashvin A <ashvin@apache.org> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > > We are in the process of merging the core building blocks of the
> > > > > > topology health manager (HM) based on Dhalion. This integration is
> > > > > > still experimental and needs to be tested thoroughly. So it is
> > > > > > desired that the HM be activated on demand and remain disabled by
> > > > > > default. Accordingly, we are proposing the following scheme to
> > > > > > launch the HM process.
> > > > >
> > > > > We are thinking of satisfying the following constraints:
> > > > >
> > > > > >    1. Launch on container-0, colocated with the scheduler and the
> > > > > >    metrics cache.
> > > > > >    2. Initially HM will be disabled by default. This means the HM
> > > > > >    process should not be started, to avoid any side effects. Once HM
> > > > > >    is well tested, a system-wide configuration would enable HM for
> > > > > >    all topologies submitted afterwards.
> > > > > >    3. If a topology explicitly opts in through its configuration, HM
> > > > > >    will be started and will take actions as per the configuration,
> > > > > >    i.e. healthmgr.yaml.
> > > > > >    4. Like other Heron processes, the executor should manage the HM's
> > > > > >    life cycle.
> > > > >
> > > > > Accordingly we propose the following.
> > > > >
> > > > > >    1. Add a new Config API to enable self-healing per topology:
> > > > > >    Config.enableHealthManager(Topology.HealthManagerMode mode). The
> > > > > >    default value will be "system", indicating that the system-wide
> > > > > >    configuration should be used.
> > > > > >    2. Add a new config to heron_internals.yaml:
> > > > > >    "heron.healthmgr.default.mode". The value will be "disabled".
> > > > > >    3. The Scheduler will read the default value of the HM mode from
> > > > > >    the heron_internals config file, as is done in
> > > > > >    SchedulerMain.setupLogging [3]. It will provide either the
> > > > > >    user-configured mode value or the default mode value to the
> > > > > >    executor as a command line argument.
> > > > > >    4. Add the HM mode to the command line arguments of
> > > > > >    heron_executor.py. This is similar to the executor command line
> > > > > >    arguments for checkpointing [2].
> > > > > >    5. The executor will launch HM if the mode is not "disabled".
> > > > > >    6. Later, if the default HM mode value is set to "dryrun" or
> > > > > >    "self-healing", HM will be launched for all newly submitted
> > > > > >    topologies.
> > > > >
> > > > >
> > > > > What do you think about this approach?
> > > > >
> > > > > Thanks,
> > > > > Ashvin
> > > > >
> > > > >
> > > > > [1] https://github.com/twitter/heron/pull/2132
> > > > > > [2] https://github.com/twitter/heron/blob/master/heron/executor/src/python/heron_executor.py#L58
> > > > > > [3] https://github.com/twitter/heron/blob/master/heron/scheduler-core/src/java/com/twitter/heron/scheduler/SchedulerMain.java#L277
> > > > >
> > > >
> > >
> >
>
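
For reference, the mode resolution in steps 3 and 5 of the original proposal
could be sketched roughly as below. This is only an illustration of the
described behaviour, not code from the Heron scheduler or executor. The
function names are hypothetical, and treating a missing per-topology value
the same as "system" is an assumption. The mode strings come from the
proposal text.

```python
from typing import Optional

# Mode names as they appear in the proposal.
SYSTEM = "system"
DISABLED = "disabled"


def resolve_hm_mode(topology_mode: Optional[str], default_mode: str = DISABLED) -> str:
    """Pick the mode handed to the executor (proposal step 3).

    A topology that leaves the mode unset, or sets it to "system", falls back
    to the system-wide default; any other value set by the user wins.
    """
    if topology_mode is None or topology_mode == SYSTEM:
        return default_mode
    return topology_mode


def should_launch_hm(mode: str) -> bool:
    """The executor launches HM unless the resolved mode is "disabled" (step 5)."""
    return mode != DISABLED


# A topology that opts in while the system default is still "disabled".
opt_in_mode = resolve_hm_mode("dryrun")
# A topology that defers to the system-wide default.
deferred_mode = resolve_hm_mode(SYSTEM)
```

Under this sketch, changing only the default to "dryrun" or "self-healing"
later would start HM for every topology that defers to the system setting,
which matches step 6.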
