edgent-dev mailing list archives

From "Victor Dogaru (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (QUARKS-105) Configurable number of application restarts
Date Tue, 05 Apr 2016 22:50:25 GMT

    [ https://issues.apache.org/jira/browse/QUARKS-105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15227299#comment-15227299 ]

Victor Dogaru commented on QUARKS-105:
--------------------------------------

> So would this be specified in the JSONConfig used to submit an application, so a number
> of restarts property similar to job name?

Yes. "restartCount" takes a value that specifies the number of times the application will
be restarted if its job becomes unhealthy.
   
> What timeframe is the count over? Is it reset to zero if the application successfully
> starts?

I'm not sure I understand the question. For the scope of this JIRA task, the restartCount
time frame is "forever".
This means the system will declare the application "dead" (and escalate the failure) after
the application has been restarted "restartCount" times, whether the application fails every
minute or every 5 days.
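
To illustrate the "forever" time frame, here is a minimal sketch of the counting logic. The
class and method names are hypothetical and are not part of the Quarks API; the sketch only
assumes that some component is notified each time the job reports health==UNHEALTHY.

```java
/** Hypothetical sketch of a "forever" restart budget; not part of the Quarks API. */
final class RestartBudget {
    private final int restartCount; // configured maximum number of restarts
    private int restartsUsed;       // never reset: the time frame is "forever"

    RestartBudget(int restartCount) {
        if (restartCount < 0) {
            throw new IllegalArgumentException("restartCount must be >= 0");
        }
        this.restartCount = restartCount;
    }

    /**
     * Called for each UNHEALTHY job event.
     * Returns true if the application should be restarted,
     * false if it should be declared "dead" and the failure escalated.
     */
    boolean onUnhealthy() {
        if (restartsUsed < restartCount) {
            restartsUsed++;
            return true;
        }
        return false;
    }

    int restartsUsed() {
        return restartsUsed;
    }
}
```

With restartCount set to 2, the first two UNHEALTHY events lead to a restart and the third
leads to escalation, regardless of how much time passes between them.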

A subsequent task might further refine the definition, for example by resetting "restartCount"
to its original value if the application has run for longer than a predefined duration.
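
A possible shape for that refinement is sketched below. It is out of scope for this JIRA task;
the class name, the stableRunMillis parameter, and the choice to pass the run time in as an
argument are all assumptions made for illustration.

```java
/** Hypothetical sketch of a restart budget that resets after a stable run. */
final class ResettingRestartBudget {
    private final int restartCount;     // original configured value
    private final long stableRunMillis; // assumed "predefined duration"
    private int remaining;              // restarts still available

    ResettingRestartBudget(int restartCount, long stableRunMillis) {
        if (restartCount < 0 || stableRunMillis < 0) {
            throw new IllegalArgumentException("arguments must be >= 0");
        }
        this.restartCount = restartCount;
        this.stableRunMillis = stableRunMillis;
        this.remaining = restartCount;
    }

    /**
     * Called for each UNHEALTHY job event.
     * ranForMillis is how long the application ran before becoming unhealthy.
     * Returns true to restart, false to escalate.
     */
    boolean onUnhealthy(long ranForMillis) {
        if (ranForMillis > stableRunMillis) {
            remaining = restartCount; // long enough run: restore the original budget
        }
        if (remaining > 0) {
            remaining--;
            return true;
        }
        return false;
    }
}
```

Passing the run time in explicitly keeps the sketch deterministic; a real implementation would
presumably measure it from the job's start and failure events.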

> What would define a successful start?

Not receiving a job event with health==UNHEALTHY. Again, this definition applies only within
the scope of this JIRA task.

One might further refine this. For example, the system should not restart an app (and should
escalate its failure instead) if it ran for less than 10 seconds before being terminated. This
behavior would prevent the system from restarting apps that crash close to startup time.
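
A minimal sketch of such a startup check follows. It is also outside the scope of this JIRA
task; the class name and the minUptimeMillis parameter are hypothetical, and the 10-second
value is only the example figure used above.

```java
/** Hypothetical sketch of a guard against restarting apps that crash at startup. */
final class StartupCrashGuard {
    private final long minUptimeMillis; // e.g. 10_000 for the 10-second example

    StartupCrashGuard(long minUptimeMillis) {
        if (minUptimeMillis < 0) {
            throw new IllegalArgumentException("minUptimeMillis must be >= 0");
        }
        this.minUptimeMillis = minUptimeMillis;
    }

    /**
     * uptimeMillis is how long the app executed before being terminated.
     * Returns true if a restart is allowed, false if the failure should be
     * escalated because the app ran for less than the minimum uptime.
     */
    boolean shouldRestart(long uptimeMillis) {
        return uptimeMillis >= minUptimeMillis;
    }
}
```

Such a guard would be consulted before the restart count is checked, so that an app crashing
at startup is escalated without consuming restarts.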

> Configurable number of application restarts
> -------------------------------------------
>
>                 Key: QUARKS-105
>                 URL: https://issues.apache.org/jira/browse/QUARKS-105
>             Project: Quarks
>          Issue Type: Sub-task
>          Components: Applications, Runtime
>            Reporter: Victor Dogaru
>              Labels: failure-recovery
>
> This configuration would allow a developer to specify the number of times the system
> should attempt to restart an application which had terminated because of an unhandled exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
