hadoop-yarn-issues mailing list archives

From "Jun Gong (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-3389) Avoid race conditions when attempts operate on shared states concurrently
Date Sat, 11 Apr 2015 16:16:12 GMT

     [ https://issues.apache.org/jira/browse/YARN-3389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Gong updated YARN-3389:
---------------------------
    Description: In AttemptFailedTransition, the new attempt gets references to two pieces of
state ('justFinishedContainers' and 'finishedContainersSentToAM') from the failed attempt, so
these attempts share both of them (as do any earlier attempts). Suppose two or more
CONTAINER_FINISHED events for different attempts are handled at the same time, and the
containers ran on the same node. The attempts will then concurrently update the value stored
under the same key of 'justFinishedContainers'. Although 'justFinishedContainers' is a
ConcurrentHashMap, operations on its values, which are of type 'List<ContainerStatus>', are
not atomic. In particular,
{code}appAttempt.justFinishedContainers.get(containerFinishedEvent.getNodeId()).add(containerFinishedEvent.getContainerStatus()){code}
is not atomic.  (was: In AttemptFailedTransition, the new attempt will get state('justFinishedContainers'
and 'finishedContainersSentToAM') reference from the failed attempt. Then the two attempts
might operate on these two variables concurrently, e.g. they might update 'justFinishedContainers'
concurrently when they are both handling CONTAINER_FINISHED event.)

> Avoid race conditions when attempts operate on shared states concurrently
> -------------------------------------------------------------------------
>
>                 Key: YARN-3389
>                 URL: https://issues.apache.org/jira/browse/YARN-3389
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0
>            Reporter: Jun Gong
>            Assignee: Jun Gong
>         Attachments: YARN-3389.01.patch
>
>
> In AttemptFailedTransition, the new attempt gets references to two pieces of state
('justFinishedContainers' and 'finishedContainersSentToAM') from the failed attempt, so
these attempts share both of them (as do any earlier attempts). Suppose two or more
CONTAINER_FINISHED events for different attempts are handled at the same time, and the
containers ran on the same node. The attempts will then concurrently update the value stored
under the same key of 'justFinishedContainers'. Although 'justFinishedContainers' is a
ConcurrentHashMap, operations on its values, which are of type 'List<ContainerStatus>', are
not atomic. In particular,
{code}appAttempt.justFinishedContainers.get(containerFinishedEvent.getNodeId()).add(containerFinishedEvent.getContainerStatus()){code}
is not atomic.
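
To make the race concrete, here is a minimal, self-contained sketch of the pattern described above and one possible way to make the update atomic. It is not taken from YARN-3389.01.patch and may differ from the actual fix. The class name SharedFinishedContainers, the helper methods, and the use of String in place of NodeId and ContainerStatus are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for the state shared between attempts.
// Strings replace NodeId and ContainerStatus to keep the sketch self-contained.
public class SharedFinishedContainers {
    private final ConcurrentMap<String, List<String>> justFinishedContainers =
        new ConcurrentHashMap<>();

    // Mirrors the pattern from the description: get() on the map is thread-safe,
    // but add() on the plain ArrayList it returns is not, so concurrent callers
    // for the same node may lose updates or corrupt the list.
    void addUnsafe(String nodeId, String status) {
        justFinishedContainers.putIfAbsent(nodeId, new ArrayList<>());
        justFinishedContainers.get(nodeId).add(status);
    }

    // One possible fix (an assumption, not necessarily the patch's approach):
    // ConcurrentHashMap.compute() runs the remapping function atomically per key,
    // so only one thread at a time mutates the list for a given node.
    void addAtomic(String nodeId, String status) {
        justFinishedContainers.compute(nodeId, (key, list) -> {
            if (list == null) {
                list = new ArrayList<>();
            }
            list.add(status);
            return list;
        });
    }

    int count(String nodeId) {
        List<String> list = justFinishedContainers.get(nodeId);
        return list == null ? 0 : list.size();
    }

    // Simulates several attempts reporting finished containers for the same node.
    static int runAtomic(int threads, int perThread) {
        SharedFinishedContainers shared = new SharedFinishedContainers();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int attempt = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    shared.addAtomic("node-1", "attempt-" + attempt + "-container-" + i);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join(); // join() makes the workers' writes visible here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.count("node-1");
    }

    public static void main(String[] args) {
        System.out.println(runAtomic(4, 10000)); // prints 40000
    }
}
```

Swapping addAtomic for addUnsafe in the worker loop reproduces the problem: the final count is then nondeterministic and the ArrayList may throw under contention. Note that compute() only protects writers that go through it. Any code that reads or iterates the list outside compute() would still need its own coordination.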



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
