hadoop-yarn-issues mailing list archives

From "Arun Suresh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5651) Changes to NMStateStore to persist reinitialization and rollback state
Date Thu, 10 Nov 2016 15:34:58 GMT

    [ https://issues.apache.org/jira/browse/YARN-5651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15654347#comment-15654347 ]

Arun Suresh commented on YARN-5651:

[~jianhe], I'm wondering what the right approach is here.

Currently, in the normal container startup flow, if NM recovery happens *after* the container
start request comes in (RecoveredContainerStatus == REQUESTED) but *before* the container
is launched (at which point RecoveredContainerStatus == LAUNCHED), the container is just reported
back as killed. If the container has been launched and is still active, the ContainerImpl's
internal state is regenerated from the StartContainerRequest.
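
To make the current behaviour concrete, here is a minimal, self-contained sketch of the recovery decision described above. It is not the actual NM code: the class RecoverySketch, the Action enum and the onRecovery method are hypothetical names used only for illustration, and the local RecoveredStatus enum merely mirrors the two statuses mentioned in this comment. It also assumes a LAUNCHED container is still active at recovery time.

```java
// Hypothetical sketch of the existing recovery decision; not the real NodeManager code.
final class RecoverySketch {

    // Local stand-in that mirrors the two statuses named in the comment.
    enum RecoveredStatus { REQUESTED, LAUNCHED }

    // Hypothetical outcomes of recovering a single container.
    enum Action { REPORT_KILLED, REBUILD_FROM_START_REQUEST }

    // Decide what to do with a container found in the state store after an NM restart.
    // Assumption: a LAUNCHED container is still active when the NM recovers.
    static Action onRecovery(RecoveredStatus status) {
        switch (status) {
            case REQUESTED:
                // Start request was stored, but the container never launched.
                return Action.REPORT_KILLED;
            case LAUNCHED:
                // Regenerate ContainerImpl state from the stored StartContainerRequest.
                return Action.REBUILD_FROM_START_REQUEST;
            default:
                throw new IllegalArgumentException("Unknown status: " + status);
        }
    }

    public static void main(String[] args) {
        for (RecoveredStatus s : RecoveredStatus.values()) {
            System.out.println(s + " -> " + onRecovery(s));
        }
    }
}
```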

I was thinking that, similarly, if a re-initialization request (re-init / restart or rollback)
arrives for a container, we just mark it in the stateStore as RecoveredContainerStatus == RE_INITIALIZING.
If the NM restarts and recovers before the container has finished re-initializing, we
just report the container as killed.
If the container has completed the relaunch, I propose that we:
# replace the ContainerImpl's internal state (launchContext, ResourceSet etc.). We
already do this now.
# also replace the StartContainerRequest object stored in the db with a new StartContainerRequest
created from the ContainerImpl's internal state.

This way, there is no real need to store the ReInitializeContainerRequest object
anywhere. Thoughts?
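
A rough sketch of the proposal, to show why the re-init request itself never needs to be persisted. Everything here is a hypothetical stand-in: Store is an in-memory map rather than the real NM state store, StartRequest is a one-field placeholder for StartContainerRequest, and the method names are made up for illustration. Setting the status back to LAUNCHED once the relaunch completes is my assumption; the proposal above only says the stored request gets replaced.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of the proposed re-init bookkeeping; not the real NMStateStore API.
final class ReInitStoreSketch {

    enum Status { REQUESTED, LAUNCHED, RE_INITIALIZING }

    // Placeholder for the stored StartContainerRequest.
    static final class StartRequest {
        final String launchContext;
        StartRequest(String launchContext) { this.launchContext = launchContext; }
    }

    // In-memory stand-in for the state store, keyed by container id.
    static final class Store {
        final Map<String, Status> status = new HashMap<>();
        final Map<String, StartRequest> startRequest = new HashMap<>();
    }

    // A re-init / restart / rollback request arrives: only the status marker is persisted.
    static void onReInitRequested(Store store, String containerId) {
        store.status.put(containerId, Status.RE_INITIALIZING);
    }

    // Relaunch completed: overwrite the stored start request with one rebuilt from
    // the container's current internal state (here just the new launch context).
    static void onRelaunchCompleted(Store store, String containerId, String currentLaunchContext) {
        store.startRequest.put(containerId, new StartRequest(currentLaunchContext));
        store.status.put(containerId, Status.LAUNCHED); // assumption, see lead-in
    }

    // NM recovery: empty means "report as killed", otherwise the launch context to rebuild from.
    static Optional<String> recover(Store store, String containerId) {
        if (store.status.get(containerId) != Status.LAUNCHED) {
            return Optional.empty();
        }
        return Optional.of(store.startRequest.get(containerId).launchContext);
    }

    public static void main(String[] args) {
        Store store = new Store();
        store.startRequest.put("c1", new StartRequest("v1"));
        store.status.put("c1", Status.LAUNCHED);

        onReInitRequested(store, "c1");
        System.out.println("mid re-init: " + recover(store, "c1"));

        onRelaunchCompleted(store, "c1", "v2");
        System.out.println("after relaunch: " + recover(store, "c1"));
    }
}
```

With this shape, recovery only ever reads the status marker and the (possibly replaced) start request, so the existing recovery path can stay as it is.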

> Changes to NMStateStore to persist reinitialization and rollback state
> ----------------------------------------------------------------------
>                 Key: YARN-5651
>                 URL: https://issues.apache.org/jira/browse/YARN-5651
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Arun Suresh
>            Assignee: Arun Suresh
