reef-dev mailing list archives

From "Sanha Lee (JIRA)" <>
Subject [jira] [Commented] (REEF-832) Add CLOSING state to EvaluatorState
Date Tue, 19 Jul 2016 06:08:20 GMT


Sanha Lee commented on REEF-832:

Hello [~MariiaMykhailova]. I took this issue yesterday.

As I understand it, an evaluator can close under three conditions: `DONE`, `KILLED`, and
`FAILED`.
 - At present, to handle the `DONE` and `KILLED` conditions, `EvaluatorManager` sends an evaluator
control message and sets the evaluator's state to `DONE` or `KILLED`. `EvaluatorRuntime` then
receives this message and shuts itself down.
 - To handle the `FAILED` condition, `EvaluatorRuntime` sends a heartbeat with a failed task
status to `EvaluatorManager` when it encounters an exception, and closes itself beforehand.
Therefore, I think the `CLOSING` state is not needed for the `FAILED` condition, only
for the `DONE` and `KILLED` conditions.

To add the `CLOSING` state for the `DONE` and `KILLED` conditions, I'm considering the
following design.
 - When `EvaluatorManager` sends the evaluator control message to shut down the evaluator, it sets
the evaluator's state to `CLOSING_DONE` or `CLOSING_KILLED` instead of `DONE` or `KILLED`.
 - When `EvaluatorRuntime` receives this message, it sends a heartbeat with a new task status such as
`DOWNED` and then shuts itself down, instead of just shutting down.
 - When `EvaluatorManager` receives this heartbeat, it sets the evaluator's state to `DONE` or `KILLED`.
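
The three steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: `ClosingSketch`, `State`, `Manager`, and all method names are hypothetical and are not the actual REEF classes, `DOWNED` is the tentative status name from the list above, and the message passing itself is omitted.

```java
// Hypothetical sketch: these names are illustrative, not the actual REEF classes.
final class ClosingSketch {

  /** Simplified stand-in for EvaluatorState with the two proposed closing states. */
  enum State { RUNNING, CLOSING_DONE, CLOSING_KILLED, DONE, KILLED, FAILED }

  /** Simplified stand-in for the driver-side EvaluatorManager. */
  static final class Manager {
    private State state = State.RUNNING;

    State state() {
      return state;
    }

    /** Step 1: the control message is sent (omitted) and a closing state is entered. */
    void requestShutdown(final boolean kill) {
      if (state != State.RUNNING) {
        throw new IllegalStateException("Cannot close from " + state);
      }
      state = kill ? State.CLOSING_KILLED : State.CLOSING_DONE;
    }

    /** Step 3: the final heartbeat with the proposed 'DOWNED' status has arrived. */
    void onDownedHeartbeat() {
      switch (state) {
        case CLOSING_DONE:
          state = State.DONE;
          break;
        case CLOSING_KILLED:
          state = State.KILLED;
          break;
        default:
          throw new IllegalStateException("Unexpected DOWNED heartbeat in " + state);
      }
    }

    /** FAILED path: the evaluator reports the failure itself, so no closing state is used. */
    void onFailedHeartbeat() {
      state = State.FAILED;
    }

    /** A single check that state-checking code could use while shutdown is pending. */
    boolean isClosing() {
      return state == State.CLOSING_DONE || state == State.CLOSING_KILLED;
    }
  }
}
```

In this sketch, `requestShutdown(true)` moves the manager to `CLOSING_KILLED`, and only the later `onDownedHeartbeat()` call moves it to `KILLED`, so the interval between the two calls is the period the `CLOSING` state is meant to describe.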

Please let me know whether my understanding and this design are sound.

> Add CLOSING state to EvaluatorState
> -----------------------------------
>                 Key: REEF-832
>                 URL:
>             Project: REEF
>          Issue Type: Improvement
>          Components: REEF-Common
>            Reporter: Mariia Mykhailova
>            Assignee: Sanha Lee
>            Priority: Minor
> {{org.apache.reef.runtime.common.driver.evaluator.EvaluatorState}} needs a CLOSING state
> to describe the time between asking an Evaluator to shut down and when that has actually happened.
> That would allow us to clean up the state checking code.

