hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-9933) Augment Service model to support starting stopped services
Date Fri, 06 Sep 2013 11:09:52 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-9933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760127#comment-13760127 ]

Steve Loughran commented on HADOOP-9933:
----------------------------------------

Thinking some more, it gets even more complex, as you don't want to allow the following state
flows:

create -> stop -> start
create -> init -> stop -> start

Yes, a "started" flag could be added, but the pure way to do this in an FSM is to have explicit
states "stopped before started" and "stopped after start", where a start is only a valid transition
from the latter.
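
To make the two-stopped-states idea concrete, here is a minimal table-driven sketch. It is not the Hadoop Service API: the enum, the class and the method names are all hypothetical, and it ignores threading and listeners.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

/** Hypothetical lifecycle states; STOPPED is split in two as described above. */
enum LifecycleState {
    CREATED, INITED, STARTED, STOPPED_BEFORE_START, STOPPED_AFTER_START
}

/** Illustrative FSM only; not the real org.apache.hadoop service classes. */
final class RestartableLifecycle {
    // For each state, the set of states it may move to.
    private static final Map<LifecycleState, EnumSet<LifecycleState>> VALID =
        new EnumMap<>(LifecycleState.class);

    static {
        VALID.put(LifecycleState.CREATED,
            EnumSet.of(LifecycleState.INITED, LifecycleState.STOPPED_BEFORE_START));
        VALID.put(LifecycleState.INITED,
            EnumSet.of(LifecycleState.STARTED, LifecycleState.STOPPED_BEFORE_START));
        VALID.put(LifecycleState.STARTED,
            EnumSet.of(LifecycleState.STOPPED_AFTER_START));
        // Stopped without ever starting: terminal, no restart allowed.
        VALID.put(LifecycleState.STOPPED_BEFORE_START,
            EnumSet.noneOf(LifecycleState.class));
        // Stopped after a successful start: the only stopped state that may restart.
        VALID.put(LifecycleState.STOPPED_AFTER_START,
            EnumSet.of(LifecycleState.STARTED));
    }

    private LifecycleState state = LifecycleState.CREATED;

    LifecycleState getState() {
        return state;
    }

    void init() {
        enter(LifecycleState.INITED);
    }

    void start() {
        enter(LifecycleState.STARTED);
    }

    void stop() {
        if (state == LifecycleState.STOPPED_BEFORE_START
                || state == LifecycleState.STOPPED_AFTER_START) {
            return; // stop is idempotent in this sketch
        }
        enter(state == LifecycleState.STARTED
            ? LifecycleState.STOPPED_AFTER_START
            : LifecycleState.STOPPED_BEFORE_START);
    }

    private void enter(LifecycleState next) {
        if (!VALID.get(state).contains(next)) {
            throw new IllegalStateException(state + " -> " + next + " is not allowed");
        }
        state = next;
    }
}
```

With this table, create -> init -> start -> stop -> start succeeds, while both flows listed above end in STOPPED_BEFORE_START and the final start throws.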

Now, back in the HADOOP-3628 service model I did try to separate out STARTED and LIVE; the service
could take itself in and out of LIVE depending on the state of its dependencies (DN -> NN,
TT -> JT, JT -> HDFS out of safe mode), along with an explicit FAILED state
[https://github.com/apache/hadoop-common/blob/HADOOP-3628/src/core/org/apache/hadoop/util/Service.java#L391]

That complicated a lot of the logic, as LIVE now has two states, as does FAILED. I also left
it to the service itself to perform the STARTED <--> LIVE transitions, and to decide when
it fails.

For the YARN service model, things are simpler:
* LIVE, with the ability to add/remove a list of things you are waiting for (blockers), which
is meant to be there purely for the benefit of management tools. This hasn't been turned on
for anything yet, though I should go through the services and add it, starting with the DN
when we get round to service-modelling it.
* STOPPED has an exception; any exception thrown during init & start goes in there, and
anything raised during shutdown (if not already set), though that's just a hint. If we drop
the latter then you can define {{FAILED := STOPPED & exception != null}}.
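
A small sketch of how those two pieces could be tracked, assuming FAILED is meant to read as "stopped with a recorded exception". The class and its method names are hypothetical and are not taken from the YARN service classes; shutdown exceptions are simply not recorded here, which is the "drop the latter" variant.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Illustrative status holder; all names are assumptions for this sketch. */
final class ServiceStatus {
    enum State { NOTINITED, INITED, STARTED, STOPPED }

    private State state = State.NOTINITED;
    private Throwable failureCause;                       // first failure wins
    private final Set<String> blockers = new LinkedHashSet<>();

    synchronized State getState() {
        return state;
    }

    synchronized void setState(State next) {
        state = next;
    }

    /** Record what the service is waiting for; purely informational. */
    synchronized void addBlocker(String name) {
        blockers.add(name);
    }

    synchronized void removeBlocker(String name) {
        blockers.remove(name);
    }

    /** Snapshot for management tools; callers cannot modify it. */
    synchronized Set<String> getBlockers() {
        return Collections.unmodifiableSet(new LinkedHashSet<>(blockers));
    }

    /** Called when init or start throws: keep the first cause, then stop. */
    synchronized void failAndStop(Throwable cause) {
        if (failureCause == null) {
            failureCause = cause;
        }
        state = State.STOPPED;
    }

    synchronized Throwable getFailureCause() {
        return failureCause;
    }

    /** FAILED is derived rather than stored: stopped, with a recorded exception. */
    synchronized boolean isFailed() {
        return state == State.STOPPED && failureCause != null;
    }
}
```

A clean shutdown leaves `isFailed()` false, so no separate FAILED state is needed in the state machine itself.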


# Could we have explicit active/passive modes for the RM, either purely as part of that
service, or for other things we could take on/offline?
# What about just creating a new service instance on each startup, in the existing process?
This would ensure that the service is cleanly initialised, and we could verify that there
aren't leakages by having a test run that tries to do this a few thousand times.

Option #2 appeals to me if the cost of creation and startup is low enough; if there's a lot
of pre-startup initialisation then the ready-to-start instance could be created at the same
time as its predecessor is stopped.
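
As a rough sketch of option #2, a holder could build a fresh, already-initialised instance whenever the current one is stopped, so no instance is ever started twice. The `SimpleService` interface and `RestartByReplacement` class are hypothetical stand-ins, not Hadoop or YARN types.

```java
import java.util.function.Supplier;

/** Hypothetical minimal lifecycle; stands in for whatever service type is used. */
interface SimpleService {
    void init();
    void start();
    void stop();
}

/** Restart by replacement: every activation runs on a never-started instance. */
final class RestartByReplacement<T extends SimpleService> {
    private final Supplier<T> factory;
    private T active;   // currently running instance, or null
    private T standby;  // initialised, ready-to-start instance, or null

    RestartByReplacement(Supplier<T> factory) {
        this.factory = factory;
        this.standby = prepare();
    }

    // Create and initialise a new instance without starting it.
    private T prepare() {
        T service = factory.get();
        service.init();
        return service;
    }

    synchronized T activate() {
        if (active != null) {
            throw new IllegalStateException("already active");
        }
        active = (standby != null) ? standby : prepare();
        standby = null;
        active.start();
        return active;
    }

    synchronized void deactivate() {
        if (active == null) {
            return; // nothing running
        }
        T old = active;
        active = null;
        // Build the successor at the time its predecessor is stopped,
        // so any pre-startup initialisation cost is paid up front.
        standby = prepare();
        old.stop();
    }
}
```

A leak check along the lines suggested above would call `activate()` and `deactivate()` a few thousand times and then inspect threads, file handles and heap usage.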


                
> Augment Service model to support starting stopped services
> ----------------------------------------------------------
>
>                 Key: HADOOP-9933
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9933
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 2.1.0-beta
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>              Labels: service
>
> For ResourceManager-HA (YARN-149 and co), we would want to start/stop/start RM's active
> services as it transitions to Active/Standby/Active respectively. In the current service model,
> we can't start the services that are already stopped.
> Would be nice to augment this. To avoid accidental restart of stopped services, we can
> add another API: start(boolean restartIfStopped). Thoughts?

