hadoop-yarn-issues mailing list archives

From "Omkar Vinit Joshi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.
Date Fri, 26 Jul 2013 22:01:49 GMT

    [ https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721282#comment-13721282 ]

Omkar Vinit Joshi commented on YARN-744:
----------------------------------------

Thanks [~bikassaha] ...

bq. AllocateResponseWrapper res
How about AllocateResponseLock?

bq. If the wrapper exists then how can the lastResponse be null?
You are right. We no longer need this check, so I am removing it.

Yes, the test won't actually be able to simulate the race condition mentioned above. I can't
think of any other test, so I am attaching the patch without one.
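
To illustrate the wrapper point above, here is a minimal sketch of such a lock holder. It is not taken from the attached patch: the type parameter R is a stand-in for the real allocate response type, and the method names are assumptions. The idea is that the holder is created with a non-null response, so once it exists the last response can never be null.

```java
import java.util.Objects;

// Hypothetical lock holder (not the patch code); R stands in for the
// real allocate response type.
final class AllocateResponseLock<R> {
    private R response;

    // The holder is always created with a response, so a null check on
    // the last response is unnecessary once the holder exists.
    AllocateResponseLock(R initial) {
        this.response = Objects.requireNonNull(initial, "initial response");
    }

    synchronized R getAllocateResponse() {
        return response;
    }

    // Replaces the stored response; the holder object itself (and hence
    // its monitor) stays the same.
    synchronized void setAllocateResponse(R next) {
        this.response = Objects.requireNonNull(next, "next response");
    }
}
```

A caller would look the holder up once per application attempt and synchronize on it, rather than on the response it contains.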
                
> Race condition in ApplicationMasterService.allocate .. It might process same allocate
request twice resulting in additional containers getting allocated.
> ---------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-744
>                 URL: https://issues.apache.org/jira/browse/YARN-744
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>            Reporter: Bikas Saha
>            Assignee: Omkar Vinit Joshi
>            Priority: Minor
>         Attachments: MAPREDUCE-3899-branch-0.23.patch, YARN-744-20130711.1.patch, YARN-744-20130715.1.patch,
YARN-744-20130726.1.patch, YARN-744.patch
>
>
> Looks like the lock taken in this is broken. It takes a lock on lastResponse object and
then puts a new lastResponse object into the map. At this point a new thread entering this
function will get a new lastResponse object and will be able to take its lock and enter the
critical section. Presumably we want to limit one response per app attempt. So the lock could
be taken on the ApplicationAttemptId key of the response map object.
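
The problem described above can be sketched in a few lines. This is an illustration under stated assumptions, not the ResourceManager code: Response, ResponseHolder, and AllocateSketch are hypothetical names, attempt ids are plain strings, and the duplicate check (request id + 1 equals the last response id) is a simplified assumption about how a resent request is recognised. The sketch locks on a stable per-attempt holder rather than on the map key suggested in the description; either way, the point is that the locked object must not be replaced inside the critical section.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an allocate response; only the id matters here.
final class Response {
    final int id;
    Response(int id) { this.id = id; }
}

// Stable per-attempt holder: put in the map once and never replaced,
// so every thread synchronizes on the same monitor.
final class ResponseHolder {
    Response last;
    ResponseHolder(Response first) { this.last = first; }
}

final class AllocateSketch {
    private final ConcurrentMap<String, Response> brokenMap = new ConcurrentHashMap<>();
    private final ConcurrentMap<String, ResponseHolder> fixedMap = new ConcurrentHashMap<>();
    // Number of requests actually processed by fixedAllocate.
    final AtomicInteger processed = new AtomicInteger();

    void register(String attemptId) {
        brokenMap.put(attemptId, new Response(0));
        fixedMap.put(attemptId, new ResponseHolder(new Response(0)));
    }

    // Broken pattern: the monitor belongs to an object that is replaced
    // inside the critical section, so the next caller locks a different
    // object and can enter while this thread is still inside.
    Response brokenAllocate(String attemptId) {
        Response last = brokenMap.get(attemptId);
        synchronized (last) {
            Response next = new Response(last.id + 1);
            brokenMap.put(attemptId, next);
            return next;
        }
    }

    // Exposes the object the broken version would lock on next.
    Object brokenLockFor(String attemptId) { return brokenMap.get(attemptId); }

    // Fixed pattern: lock the holder, then compare ids while holding it.
    // A real implementation would also reject stale ids; omitted here.
    Response fixedAllocate(String attemptId, int requestResponseId) {
        ResponseHolder holder = fixedMap.get(attemptId);
        synchronized (holder) {
            if (requestResponseId + 1 == holder.last.id) {
                return holder.last; // duplicate: resend, do not process again
            }
            processed.incrementAndGet();
            holder.last = new Response(holder.last.id + 1);
            return holder.last;
        }
    }

    // Exposes the object the fixed version locks on; it never changes.
    Object fixedLockFor(String attemptId) { return fixedMap.get(attemptId); }
}
```

In the broken version the lock object differs before and after each call, which is what lets a second thread in. In the fixed version the holder is the same object for the lifetime of the attempt, so a resent request with the same id is answered from the stored response instead of being processed twice.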

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
