hadoop-yarn-issues mailing list archives

From "Arun Suresh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5292) Support for PAUSED container state
Date Sat, 03 Dec 2016 19:36:58 GMT

    [ https://issues.apache.org/jira/browse/YARN-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15718618#comment-15718618 ]

Arun Suresh commented on YARN-5292:
-----------------------------------

Thanks for the patch, Hitesh. A couple of comments:

* {{ContainerExecutor}}: maybe the default behavior should not be to throw an Exception. We should
probably LOG.warn() too.
* {{ContainerImpl}}: in a couple of places, you can maybe collapse a bunch of transitions
like this:

{noformat}
.addTransition(ContainerState.KILLING,
    ContainerState.KILLING,
    ContainerEventType.CONTAINER_LAUNCHED)
.addTransition(ContainerState.KILLING,
    ContainerState.KILLING,
    ContainerEventType.PAUSE_CONTAINER)
{noformat}

into 

{noformat}
.addTransition(ContainerState.KILLING,
    ContainerState.KILLING,
    EnumSet.of(ContainerEventType.CONTAINER_LAUNCHED,
        ContainerEventType.PAUSE_CONTAINER))
{noformat}

* It looks like when a container is REINITIALIZING and it receives a PAUSE event, you are
killing the container. I think it might be better to re-queue the container somehow in this
case, so the scheduler can restart it when resources are available.
* I was thinking PAUSED and RESUMING should be reported to the RM as SCHEDULED itself. SCHEDULED
should be used to signify that the container allocation is secure, but the container is not running.
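
To illustrate the EnumSet suggestion above, here is a minimal, self-contained sketch. It assumes trimmed-down enums (SketchState, SketchEvent) and a hypothetical TransitionTable class that stands in for the NodeManager state machine; none of these names come from the patch, and the real factory may behave differently.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, trimmed-down enums; the real ContainerState and
// ContainerEventType have many more values.
enum SketchState { RUNNING, KILLING, DONE }
enum SketchEvent { CONTAINER_LAUNCHED, PAUSE_CONTAINER, KILL_CONTAINER }

// Hypothetical stand-in for a transition table (an assumption for this sketch).
class TransitionTable {
    private final Map<SketchState, Map<SketchEvent, SketchState>> table =
        new EnumMap<SketchState, Map<SketchEvent, SketchState>>(SketchState.class);

    // Single-event form: one call per (state, event) pair.
    TransitionTable addTransition(SketchState pre, SketchState post, SketchEvent event) {
        table.computeIfAbsent(pre,
            s -> new EnumMap<SketchEvent, SketchState>(SketchEvent.class)).put(event, post);
        return this;
    }

    // Set form: registers the same transition for every event in the set.
    TransitionTable addTransition(SketchState pre, SketchState post, Set<SketchEvent> events) {
        for (SketchEvent e : events) {
            addTransition(pre, post, e);
        }
        return this;
    }

    // Looks up the next state, rejecting events with no registered transition.
    SketchState next(SketchState current, SketchEvent event) {
        Map<SketchEvent, SketchState> row = table.get(current);
        if (row == null || !row.containsKey(event)) {
            throw new IllegalStateException("No transition from " + current + " on " + event);
        }
        return row.get(event);
    }
}

class TransitionSketch {
    public static void main(String[] args) {
        TransitionTable collapsed = new TransitionTable()
            .addTransition(SketchState.KILLING, SketchState.KILLING,
                EnumSet.of(SketchEvent.CONTAINER_LAUNCHED, SketchEvent.PAUSE_CONTAINER));
        System.out.println(collapsed.next(SketchState.KILLING, SketchEvent.PAUSE_CONTAINER));
    }
}
```

In this sketch the set form is only a loop over the single-event form, so the collapsed call registers exactly the same transitions as the two separate calls.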

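For the {{ContainerExecutor}} point, a rough sketch of a non-throwing default might look like the following. SketchExecutor and its subclasses are hypothetical names, and the boolean return value is an assumption made for illustration; the patch may signal "pause not supported" differently.

```java
import java.util.logging.Logger;

// Hypothetical base class; not the real ContainerExecutor API.
abstract class SketchExecutor {
    private static final Logger LOG = Logger.getLogger(SketchExecutor.class.getName());

    // Default: pausing is unsupported. Log a warning instead of throwing,
    // and return false so the caller can choose a fallback.
    boolean pauseContainer(String containerId) {
        LOG.warning(getClass().getSimpleName()
            + " does not support pausing container " + containerId);
        return false;
    }
}

// Executor that inherits the warn-and-return-false default.
class DefaultSketchExecutor extends SketchExecutor { }

// Executor that overrides the default because it can actually pause.
class PausingSketchExecutor extends SketchExecutor {
    @Override
    boolean pauseContainer(String containerId) {
        // A real implementation would freeze the container's processes here.
        return true;
    }
}

class ExecutorSketch {
    public static void main(String[] args) {
        System.out.println(new DefaultSketchExecutor().pauseContainer("container_1"));
        System.out.println(new PausingSketchExecutor().pauseContainer("container_1"));
    }
}
```

A caller that gets false back could then fall back to killing the container, which matches the default described in the issue text.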

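The last point, reporting PAUSED and RESUMING to the RM as SCHEDULED, amounts to a many-to-one mapping from NM-internal states to reported states. A small sketch, assuming hypothetical enums NmState and ReportedState that are not the actual YARN types:

```java
// Hypothetical NM-internal states (a subset, for illustration only).
enum NmState { SCHEDULED, RUNNING, PAUSED, RESUMING, DONE }

// Hypothetical states reported to the RM (an assumption for this sketch).
enum ReportedState { SCHEDULED, RUNNING, COMPLETE }

class StateReporting {
    // PAUSED and RESUMING collapse into SCHEDULED: the allocation is
    // secure, but the container is not running.
    static ReportedState toReported(NmState state) {
        switch (state) {
            case SCHEDULED:
            case PAUSED:
            case RESUMING:
                return ReportedState.SCHEDULED;
            case RUNNING:
                return ReportedState.RUNNING;
            case DONE:
                return ReportedState.COMPLETE;
            default:
                throw new IllegalArgumentException("Unknown state: " + state);
        }
    }

    public static void main(String[] args) {
        for (NmState s : NmState.values()) {
            System.out.println(s + " -> " + toReported(s));
        }
    }
}
```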
> Support for PAUSED container state
> ----------------------------------
>
>                 Key: YARN-5292
>                 URL: https://issues.apache.org/jira/browse/YARN-5292
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Hitesh Sharma
>            Assignee: Hitesh Sharma
>         Attachments: YARN-5292.001.patch, YARN-5292.002.patch, YARN-5292.003.patch, YARN-5292.004.patch,
yarn-5292.pdf
>
>
> YARN-2877 introduced OPPORTUNISTIC containers, and YARN-5216 proposes to add capability
to customize how OPPORTUNISTIC containers get preempted.
> In this JIRA we propose introducing a PAUSED container state.
> When a running container gets preempted, it enters the PAUSED state, where it remains
until resources are freed up on the node; the preempted container can then resume to the running
state.
>  
> One scenario where this capability is useful is work preservation. How preemption is
done, and whether the container supports it, is implementation specific.
> For instance, if the container is a virtual machine, then preempt would pause the VM
and resume would restore it back to the running state.
> If the container doesn't support preemption, then preempt would default to killing the
container. 
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


