heron-dev mailing list archives

From Sanjeev Kulkarni <sanjee...@gmail.com>
Subject Re: [Discuss] HealthManager launch switch on container-0
Date Fri, 04 Aug 2017 18:00:05 GMT
Having a setting that enables it cluster-wide is indeed one of the desired
properties, as mentioned in Ashvin's first email. The setting in
healthmgr.yaml would control that. It would be set to false by default.
Users interested in trying it out could change that and submit their topology.
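
A minimal sketch of the default-off check described above, in Python. The key name "heron.healthmgr.enabled" is a hypothetical placeholder (the thread does not name one), and the dict stands in for the parsed contents of healthmgr.yaml:

```python
# Hypothetical key name; the thread does not specify the actual key.
HEALTHMGR_ENABLED_KEY = "heron.healthmgr.enabled"


def health_manager_enabled(healthmgr_config):
    """Return True only when the cluster-wide setting is explicitly true.

    healthmgr_config is assumed to be a dict holding the parsed contents
    of healthmgr.yaml. A missing key means "disabled".
    """
    value = healthmgr_config.get(HEALTHMGR_ENABLED_KEY, False)
    if isinstance(value, str):
        # Tolerate values that were loaded as strings, e.g. "true" / "false".
        return value.strip().lower() == "true"
    return bool(value)
```

With no entry in the file the function returns False, so nothing changes for existing users until someone edits the setting.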


On Fri, Aug 4, 2017 at 10:55 AM, Karthik Ramasamy <karthik@streaml.io>
wrote:

> If we enable it in healthmgr.yaml, it becomes cluster-wide, which is not the
> desired option. For the CLI, there are no changes in the code. The config
> property function is already built in. All you have to do is determine what
> the key and its value should be.
>
> cheers
> /karthik
>
> On Fri, Aug 4, 2017 at 10:51 AM, Sanjeev Kulkarni <sanjeevrk@gmail.com>
> wrote:
>
> > Enabling health manager doesn't sound like an API. Thus I agree that
> > Config is not the right place for a setting like this.
> > I also don't like overloading the CLI with this. IMO the CLI is already
> > overloaded with a bunch of things that it shouldn't be.
> > Why can't we make this part of healthmgr.yaml itself? Or maybe
> > heron_internals.yaml?
> >
> > On Fri, Aug 4, 2017 at 10:12 AM, Karthik Ramasamy <karthik@streaml.io>
> > wrote:
> >
> > > Ashvin -
> > >
> > > Instead of adding a Config API to enable self-healing per topology, an
> > > interested user can enable the config using --config-property during
> > > heron submit. For example,
> > >
> > > heron submit <cluster-name> --config-property
> > > "heron.config.topology.healthmanager.mode=enable" <topology-file>
> > > <topology-class> <topology-name>
> > >
> > > The advantage of this approach is that there is no hard coded config in
> > > the code that will require later removal. Thoughts?
> > >
> > > cheers
> > > /karthik
> > >
> > >
> > > On Fri, Aug 4, 2017 at 8:57 AM, Ashvin A <ashvin@apache.org> wrote:
> > >
> > > > Hi,
> > > >
> > > > We are in the process of merging the core building blocks of the
> > > > topology health manager (HM) based on Dhalion. This integration is
> > > > still experimental and needs to be tested thoroughly, so the HM should
> > > > be activated on demand and remain disabled by default. Accordingly, we
> > > > are proposing the following scheme to launch the HM process.
> > > >
> > > > We are thinking of satisfying the following constraints:
> > > >
> > > >    1. Launch on container-0, colocated with the scheduler and the
> > > >    metrics cache.
> > > >    2. Initially HM will be disabled by default. This means the HM
> > > >    process should not be started, to avoid any side effects. Once HM
> > > >    is well tested, a system-wide configuration would enable HM for all
> > > >    topologies submitted afterwards.
> > > >    3. If a topology explicitly opts in through its configuration, HM
> > > >    will be started and take actions as per the configuration, i.e.
> > > >    healthmgr.yaml.
> > > >    4. Like other Heron processes, the executor should manage the HM's
> > > >    life cycle.
> > > > Accordingly we propose the following.
> > > >
> > > >    1. Add a new Config API to enable self-healing per topology:
> > > >    Config.enableHealthManager(Topology.HealthManagerMode mode). The
> > > >    default value will be "system", indicating that the system-wide
> > > >    configuration should be used.
> > > >    2. Add a new config to heron_internals.yaml:
> > > >    "heron.healthmgr.default.mode". The value will be "disabled".
> > > >    3. The Scheduler will read the default value of HM mode from the
> > > >    heron_internals config file, as done in SchedulerMain.setupLogging
> > > >    [3]. It will provide either the user-configured mode value or the
> > > >    default mode value to the executor as a command line argument.
> > > >    4. Add HM mode to the command line arguments of heron_executor.py.
> > > >    This is similar to the executor command line arguments for
> > > >    checkpointing [2].
> > > >    5. The executor will launch HM if the mode is not "disabled".
> > > >    6. Later, if the default HM mode value is set to "dryrun" or
> > > >    "self-healing", HM will be launched for all newly submitted
> > > >    topologies.
> > > >
> > > > What do you think about this approach?
> > > >
> > > > Thanks,
> > > > Ashvin
> > > >
> > > >
> > > > [1] https://github.com/twitter/heron/pull/2132
> > > > [2] https://github.com/twitter/heron/blob/master/heron/executor/src/python/heron_executor.py#L58
> > > > [3] https://github.com/twitter/heron/blob/master/heron/scheduler-core/src/java/com/twitter/heron/scheduler/SchedulerMain.java#L277
> > > >
> > >
> >
>
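
For reference, steps 3 to 5 of Ashvin's proposal quoted above could be sketched roughly as follows in Python. The mode strings ("system", "disabled", "dryrun", "self-healing") come from the thread; the function names and the validation behavior are assumptions made for illustration, not existing Heron code:

```python
# Mode names taken from the proposal in this thread.
SYSTEM = "system"
DISABLED = "disabled"
CONCRETE_MODES = {DISABLED, "dryrun", "self-healing"}


def resolve_hm_mode(topology_mode, system_default):
    """Pick the mode the scheduler would pass to the executor (step 3).

    topology_mode: the per-topology value, or None / "system" when the user
        did not override anything.
    system_default: the value of "heron.healthmgr.default.mode" read from
        the heron_internals config file.
    Hypothetical helper: raises ValueError for unknown modes.
    """
    if topology_mode is None or topology_mode == SYSTEM:
        mode = system_default
    else:
        mode = topology_mode
    if mode not in CONCRETE_MODES:
        raise ValueError("unknown health manager mode: %r" % (mode,))
    return mode


def should_launch_health_manager(mode):
    """Executor-side check (step 5): launch unless the mode is "disabled"."""
    return mode != DISABLED
```

Under these assumptions, a topology that leaves the mode at "system" follows the cluster default, so changing the default later (step 6) affects only newly submitted topologies that did not opt in or out explicitly.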
