hadoop-yarn-issues mailing list archives

From "Jian He (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-7486) Race condition in service AM that can cause NPE
Date Tue, 14 Nov 2017 01:32:00 GMT

     [ https://issues.apache.org/jira/browse/YARN-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian He updated YARN-7486:
--------------------------
    Description: 
1. container1 completes for instance1.
2. instance1 is added to the pending list.
3. container2 is allocated and assigned to instance1, which records container2 inside instance1.
4. In the meantime, ContainerStoppedTransition is called on instance1 and sets the container back to null.
As a result, the recorded container2 is lost.

{code}
		java.lang.NullPointerException
			at org.apache.hadoop.yarn.service.provider.ProviderUtils.initCompTokensForSubstitute(ProviderUtils.java:402)
			at org.apache.hadoop.yarn.service.provider.AbstractProviderService.buildContainerLaunchContext(AbstractProviderService.java:70)
			at org.apache.hadoop.yarn.service.containerlaunch.ContainerLaunchService$ContainerLauncher.run(ContainerLaunchService.java:89)
			at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
			at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
			at java.lang.Thread.run(Thread.java:745)
{code}

  was:
1. container1 completes for instance1.
2. instance1 is added to the pending list.
3. container2 is allocated and assigned to instance1, which records container2 inside instance1.
4. In the meantime, ContainerStoppedTransition is called on instance1 and sets the container back to null.
As a result, the recorded container2 is lost.


> Race condition in service AM that can cause NPE
> -----------------------------------------------
>
>                 Key: YARN-7486
>                 URL: https://issues.apache.org/jira/browse/YARN-7486
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Jian He
>            Assignee: Jian He
>
> 1. container1 completes for instance1.
> 2. instance1 is added to the pending list.
> 3. container2 is allocated and assigned to instance1, which records container2 inside instance1.
> 4. In the meantime, ContainerStoppedTransition is called on instance1 and sets the container back to null.
> As a result, the recorded container2 is lost.
> {code}
> 		java.lang.NullPointerException
> 			at org.apache.hadoop.yarn.service.provider.ProviderUtils.initCompTokensForSubstitute(ProviderUtils.java:402)
> 			at org.apache.hadoop.yarn.service.provider.AbstractProviderService.buildContainerLaunchContext(AbstractProviderService.java:70)
> 			at org.apache.hadoop.yarn.service.containerlaunch.ContainerLaunchService$ContainerLauncher.run(ContainerLaunchService.java:89)
> 			at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 			at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 			at java.lang.Thread.run(Thread.java:745)
> {code}
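
The interleaving described in steps 1-4 can be replayed in a fixed order to show how the recorded container gets lost. The following is only a minimal sketch under stated assumptions: `InstanceRaceSketch`, `Instance`, `assign`, `onStoppedUnguarded`, and `onStoppedGuarded` are hypothetical names invented for illustration, not the real service AM classes, and the guarded variant is one possible way to avoid the overwrite, not necessarily the change made for YARN-7486.

```java
import java.util.Objects;

// Hypothetical stand-in for a component instance in the service AM.
// None of these names come from the real YARN code base.
public class InstanceRaceSketch {

    static final class Instance {
        private String containerId; // null means no container is recorded

        // Step 3: a newly allocated container is recorded on the instance.
        synchronized void assign(String newContainerId) {
            containerId = newContainerId;
        }

        // Step 4 as described: the stop handler clears the field unconditionally,
        // regardless of which container actually stopped.
        synchronized void onStoppedUnguarded(String stoppedContainerId) {
            containerId = null;
        }

        // Assumed guard: clear the field only if the stopped container
        // is still the one recorded on the instance.
        synchronized void onStoppedGuarded(String stoppedContainerId) {
            if (Objects.equals(containerId, stoppedContainerId)) {
                containerId = null;
            }
        }

        synchronized String getContainerId() {
            return containerId;
        }
    }

    // Replays the problematic ordering deterministically and returns
    // the container id left on the instance afterwards.
    static String replay(boolean guarded) {
        Instance instance1 = new Instance();
        instance1.assign("container1");
        // Steps 1-2: container1 completes and instance1 goes back to the
        // pending list (the list itself is not modelled here).
        instance1.assign("container2"); // step 3
        if (guarded) {
            instance1.onStoppedGuarded("container1");
        } else {
            instance1.onStoppedUnguarded("container1"); // step 4
        }
        return instance1.getContainerId();
    }

    public static void main(String[] args) {
        System.out.println("unguarded: " + replay(false));
        System.out.println("guarded:   " + replay(true));
    }
}
```

In the unguarded replay the instance ends up with no container recorded, which is the state a later launch step would dereference. In the guarded replay the late stop event for container1 leaves container2 in place.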



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

