heron-dev mailing list archives

From Ashvin A <ash...@apache.org>
Subject Re: [Discuss] HealthManager launch switch on container-0
Date Fri, 04 Aug 2017 18:08:46 GMT
I agree, changing the config option is not needed in this case.

Thanks for the suggestions

On Fri, Aug 4, 2017 at 11:03 AM, Karthik Ramasamy <karthik@streaml.io>
wrote:

> That is better. If the setting is in healthmgr.yaml, you could use the cli
> to turn it on for desired topologies, and
> that gives independent control at a per-topology level.
>
> cheers
> /karthik
>
> On Fri, Aug 4, 2017 at 11:00 AM, Sanjeev Kulkarni <sanjeevrk@gmail.com>
> wrote:
>
>> Having a setting that enables it cluster-wide is indeed one of the desired
>> properties, as mentioned in Ashvin's first email. The setting in
>> healthmgr.yaml would control that. It would be set to false by default.
>> Users interested in trying it out could change that and submit it.
>>
>>
>> On Fri, Aug 4, 2017 at 10:55 AM, Karthik Ramasamy <karthik@streaml.io>
>> wrote:
>>
>> > If we enable at healthmgr.yaml, it becomes cluster wide - which is not the
>> > desired option. For cli, there are no changes
>> > in the code. The config property function is built-in already. All you have
>> > to do is determine - what should be the key and
>> > its value.
>> >
>> > cheers
>> > /karthik
>> >
>> > On Fri, Aug 4, 2017 at 10:51 AM, Sanjeev Kulkarni <sanjeevrk@gmail.com>
>> > wrote:
>> >
>> > > Enabling health manager doesn't sound like an API. Thus I agree that Config
>> > > is not the right place for a setting like this.
>> > > I also don't like overloading cli with this. IMO cli is already overloaded
>> > > with a bunch of things that it shouldn't be.
>> > > Why can't we make this part of healthmgr.yaml itself? Or maybe
>> > > heron_internals.yaml?
>> > >
>> > > On Fri, Aug 4, 2017 at 10:12 AM, Karthik Ramasamy <karthik@streaml.io>
>> > > wrote:
>> > >
>> > > > Ashvin -
>> > > >
>> > > > Instead of adding a Config API to enable self-healing per topology, an
>> > > > interested user can enable the config using --config-property during heron
>> > > > submit. For example,
>> > > >
>> > > > heron submit <cluster-name> --config-property
>> > > > "heron.config.topology.healthmanager.mode=enable" <topology-file>
>> > > > <topology-class> <topology-name>
>> > > >
>> > > > The advantage of this approach is that there is no hard coded config in the
>> > > > code that will require later removal. Thoughts?
>> > > >
>> > > > cheers
>> > > > /karthik
>> > > >
>> > > >
>> > > > On Fri, Aug 4, 2017 at 8:57 AM, Ashvin A <ashvin@apache.org> wrote:
>> > > >
>> > > > > Hi,
>> > > > >
>> > > > > We are in the process of merging the core building blocks of the topology
>> > > > > health manager (HM) based on Dhalion. This integration is still
>> > > > > experimental and needs to be tested thoroughly. So it is desired that the
>> > > > > HM be activated on-demand and remain disabled by default. Accordingly we
>> > > > > are proposing the following scheme to launch the HM process.
>> > > > >
>> > > > > We are thinking of satisfying the following constraints:
>> > > > >
>> > > > >    1. Launch on container-0, colocated with the scheduler and the metrics
>> > > > >    cache.
>> > > > >    2. Initially HM will be disabled by default. This means the HM process
>> > > > >    should not be started, to avoid any side-effects. Once HM is well tested, a
>> > > > >    system wide configuration would enable HM for all topologies submitted
>> > > > >    afterwards.
>> > > > >    3. If a topology explicitly opts in through its configuration, HM will be
>> > > > >    started and take actions as per the configuration, i.e. healthmgr.yaml.
>> > > > >    4. Like other Heron processes, the executor should manage the HM's life
>> > > > >    cycle.
>> > > > >
>> > > > > Accordingly we propose the following.
>> > > > >
>> > > > >    1. Add new Config api to enable self-healing per topology:
>> > > > >    Config.enableHealthManager(Topology.HealthManagerMode mode). Default
>> > > > >    value will be "system" to indicate use of the system wide configuration.
>> > > > >    2. Add a new config to heron_internals.yaml:
>> > > > >    "heron.healthmgr.default.mode". The value will be "disabled".
>> > > > >    3. The Scheduler will read the default value of HM mode from the
>> > > > >    heron_internals config file, like done in SchedulerMain.setupLogging [3].
>> > > > >    It will provide either the user configured mode value or the default
>> > > > >    mode value to the executor as a command line argument.
>> > > > >    4. Add HM mode to the command line arguments to heron_executor.py. This
>> > > > >    is similar to the executor command line arguments for checkpointing [2].
>> > > > >    5. The executor will launch HM if mode is not "disabled".
>> > > > >    6. Later if the default HM mode value is set to "dryrun" or
>> > > > >    "self-healing", HM will be launched for all newly submitted topologies.
>> > > > >
>> > > > > What do you think about this approach?
>> > > > >
>> > > > > Thanks,
>> > > > > Ashvin
>> > > > >
>> > > > >
>> > > > > [1] https://github.com/twitter/heron/pull/2132
>> > > > > [2] https://github.com/twitter/heron/blob/master/heron/executor/src/python/heron_executor.py#L58
>> > > > > [3] https://github.com/twitter/heron/blob/master/heron/scheduler-core/src/java/com/twitter/heron/scheduler/SchedulerMain.java#L277
>> > > > >
>> > > >
>> > >
>> >
>>
>
>
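
The mode resolution described in the original proposal (a per-topology mode that defaults to "system", a system-wide default of "disabled", and an executor that launches HM only when the resolved mode is not "disabled") could be sketched roughly as below. This is only an illustration of the logic discussed in the thread, not code from Heron: the function names resolve_hm_mode and should_launch_hm are hypothetical, and only the mode strings come from the proposal.

```python
from typing import Optional

# Mode strings taken from the proposal in this thread.
SYSTEM = "system"
DISABLED = "disabled"


def resolve_hm_mode(topology_mode: Optional[str],
                    system_default: str = DISABLED) -> str:
    """Hypothetical helper: pick the effective HM mode for a topology.

    A topology that sets nothing, or sets "system", falls back to the
    system-wide default (proposed as "disabled" initially).
    """
    if topology_mode is None or topology_mode == SYSTEM:
        return system_default
    return topology_mode


def should_launch_hm(mode: str) -> bool:
    """Hypothetical helper: the executor launches HM unless the mode is "disabled"."""
    return mode != DISABLED
```

With these assumptions, a topology that opts in with "dryrun" gets HM launched, while one that leaves the mode at "system" follows whatever the system-wide default is at submission time.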
