heron-dev mailing list archives

From Karthik Ramasamy <kart...@streaml.io>
Subject Re: [Discuss] HealthManager launch switch on container-0
Date Fri, 04 Aug 2017 17:12:30 GMT
Ashvin -

Instead of adding a Config API to enable self-healing per topology, an
interested user can enable the config using --config-property during heron
submit. For example,

heron submit <cluster-name> \
  --config-property "heron.config.topology.healthmanager.mode=enable" \
  <topology-file> <topology-class> <topology-name>

The advantage of this approach is that no hard-coded config is added to the
code that would have to be removed later. Thoughts?
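
To make the override concrete, here is a minimal Python sketch of how
key=value properties like the one above could be folded into a config map.
The helper name parse_config_properties is hypothetical and is not Heron's
actual CLI code; only the property key is taken from the command above.

```python
from typing import Dict, Iterable

# Property key taken from the submit command above.
HM_MODE_KEY = "heron.config.topology.healthmanager.mode"


def parse_config_properties(props: Iterable[str]) -> Dict[str, str]:
    """Fold "key=value" strings into a dict; later entries win.

    Hypothetical helper for illustration, not part of Heron.
    """
    config: Dict[str, str] = {}
    for prop in props:
        # Split on the first "=" only, so values may contain "=".
        key, sep, value = prop.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"expected key=value, got {prop!r}")
        config[key.strip()] = value.strip()
    return config


overrides = parse_config_properties([f"{HM_MODE_KEY}=enable"])
print(overrides.get(HM_MODE_KEY, "unset"))
```

Because the value only travels with the submit command, dropping the flag
later needs no code change, which is the advantage argued for above.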


On Fri, Aug 4, 2017 at 8:57 AM, Ashvin A <ashvin@apache.org> wrote:

> Hi,
> We are in the process of merging the core building blocks of the topology
> health manager (HM) based on Dhalion. This integration is still
> experimental and needs to be tested thoroughly, so the HM should be
> activated on demand and remain disabled by default. Accordingly, we
> are proposing the following scheme to launch the HM process.
> We are thinking of satisfying the following constraints:
>    1. Launch on container-0, colocated with the scheduler and the metrics
>    cache.
>    2. Initially HM will be disabled by default. This means the HM process
>    should not be started, to avoid any side effects. Once HM is well
>    tested, a system-wide configuration would enable HM for all topologies
>    submitted afterwards.
>    3. If a topology explicitly opts in through its configuration, HM will
>    be started and take actions as per the configuration, i.e. healthmgr.yaml.
>    4. Like other Heron processes, the executor should manage the HM's life
>    cycle.
> Accordingly we propose the following.
>    1. Add a new Config API to enable self-healing per topology:
>    Config.enableHealthManager(Topology.HealthManagerMode mode). The default
>    value will be "system", meaning the system-wide configuration is used.
>    2. Add a new config to heron_internals.yaml:
>    "heron.healthmgr.default.mode". The value will be "disabled".
>    3. The Scheduler will read the default value of the HM mode from the
>    heron_internals config file, as is done in SchedulerMain.setupLogging [3].
>    It will provide either the user-configured mode value or the default
>    mode value to the executor as a command line argument.
>    4. Add the HM mode to the command line arguments of heron_executor.py.
>    This is similar to the executor command line arguments for checkpointing [2].
>    5. The executor will launch HM if the mode is not "disabled".
>    6. Later, if the default HM mode value is set to "dryrun" or
>    "self-healing", HM will be launched for all newly submitted topologies.
> What do you think about this approach?
> Thanks,
> Ashvin
> [1] https://github.com/twitter/heron/pull/2132
> [2] https://github.com/twitter/heron/blob/master/heron/executor/src/python/heron_executor.py#L58
> [3] https://github.com/twitter/heron/blob/master/heron/scheduler-core/src/java/com/twitter/heron/scheduler/SchedulerMain.java#L277
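
The mode resolution described in steps 3 to 5 of the quoted proposal can be
sketched in Python as follows. This is only an illustration under stated
assumptions: the mode names come from the proposal, while the function names
resolve_hm_mode and should_launch_hm are hypothetical and do not exist in
Heron.

```python
from typing import Optional

# Mode names taken from the quoted proposal.
SYSTEM = "system"
DISABLED = "disabled"
KNOWN_MODES = {SYSTEM, DISABLED, "dryrun", "self-healing"}


def resolve_hm_mode(topology_mode: Optional[str],
                    system_default: str = DISABLED) -> str:
    """Return the mode the scheduler would hand to the executor.

    Hypothetical helper: a topology mode of "system" (or no mode at all)
    defers to the system-wide default; any other mode is an explicit opt-in.
    """
    mode = topology_mode or SYSTEM
    if mode not in KNOWN_MODES:
        raise ValueError(f"unknown health manager mode: {mode!r}")
    if system_default not in KNOWN_MODES - {SYSTEM}:
        raise ValueError(f"invalid system default: {system_default!r}")
    return system_default if mode == SYSTEM else mode


def should_launch_hm(effective_mode: str) -> bool:
    """Step 5: the executor launches HM unless the mode is "disabled"."""
    return effective_mode != DISABLED


# Default today: nothing configured, so HM stays off.
print(should_launch_hm(resolve_hm_mode(None)))
# A topology that opts in explicitly gets HM even while the default is off.
print(should_launch_hm(resolve_hm_mode("dryrun")))
```

Flipping only the system_default argument to "dryrun" or "self-healing"
models step 6, where HM starts for every newly submitted topology that has
not chosen a mode of its own.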
