Mailing-List: contact yarn-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: yarn-issues@hadoop.apache.org
Date: Wed, 17 Jul 2013 21:56:48 +0000 (UTC)
From: "Omkar Vinit Joshi (JIRA)" <jira@apache.org>
To: yarn-issues@hadoop.apache.org
Message-ID: <JIRA.12656808.1373355371162.69050.1374098208962@arcas>
In-Reply-To: <JIRA.12656808.1373355371162@arcas>
References: <JIRA.12656808.1373355371162@arcas>
Subject: [jira] [Commented] (YARN-906)
 TestNMClient.testNMClientNoCleanupOnStop fails occasionally
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/YARN-906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13711683#comment-13711683 ] 

Omkar Vinit Joshi commented on YARN-906:
----------------------------------------

What you are saying above makes complete sense. It is definitely a problem, caused by a mismatch between the dispatcher queue processing events and exec actually launching the thread. We should probably make sure that the whole computation of the call() method is moved inside the try{} catch{}, and check the flag status right at the beginning. For updating the flag status we definitely need locking.
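
A minimal sketch of the flag-check idea, not the actual NodeManager code. The class name GuardedLaunch, the return codes, and the cancel() method are all hypothetical; the only points it tries to show are that the flag is read and written under one lock, and that the whole body of call() sits inside the try/catch.

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for a container launch task; names are illustrative only.
final class GuardedLaunch implements Callable<Integer> {
    static final int LAUNCHED = 0;   // work ran to completion
    static final int SKIPPED = -1;   // cancelled before the work started
    static final int FAILED = 1;     // work threw an exception

    private final Object lock = new Object();
    private boolean cancelled = false; // guarded by lock
    private boolean started = false;   // guarded by lock
    private final Runnable work;

    GuardedLaunch(Runnable work) {
        this.work = work;
    }

    /** Marks the task cancelled. Returns true if the work had not started yet. */
    boolean cancel() {
        synchronized (lock) {
            cancelled = true;
            return !started;
        }
    }

    @Override
    public Integer call() {
        try {
            // Check the flag first, under the same lock that cancel() uses,
            // so a cancel cannot slip in between the check and the start.
            synchronized (lock) {
                if (cancelled) {
                    return SKIPPED;
                }
                started = true;
            }
            work.run();
            return LAUNCHED;
        } catch (RuntimeException e) {
            // Everything in call() is inside the try, so a failure is
            // always reported instead of being lost.
            return FAILED;
        }
    }
}
```

With this shape, whichever side takes the lock first decides the outcome: either cancel() sees that the work has not started and call() later returns SKIPPED, or call() has already set started and cancel() is told so.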
An alternative solution, which seems the most logical to me: what if we send the same event from the place where we cancel the thread, and expect (and ignore) the additional event at KILLING state? I haven't thought much about it, but it is worth considering as an alternative. Thoughts?
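
A rough sketch of that alternative, using a toy state machine and not the real container state machine. The states, the event names, and the KillTolerantContainer class are all made up for illustration. The assumption is that both the launch thread and the cancel site may report the kill, so the first copy completes the kill and any later copy is dropped instead of being treated as an invalid transition.

```java
// Toy model only; states and events are hypothetical, not YARN's.
final class KillTolerantContainer {
    enum State { RUNNING, KILLING, DONE }
    enum Event { KILL_REQUESTED, KILLED_ON_REQUEST }

    private State state = State.RUNNING;
    private int ignored = 0; // events dropped as duplicates or late arrivals

    synchronized void handle(Event event) {
        switch (state) {
            case RUNNING:
                if (event == Event.KILL_REQUESTED) {
                    state = State.KILLING;
                }
                break;
            case KILLING:
                if (event == Event.KILLED_ON_REQUEST) {
                    // First report of the kill, from either sender, finishes it.
                    state = State.DONE;
                } else {
                    ignored++; // repeated kill request
                }
                break;
            case DONE:
                ignored++; // second copy of the kill report is harmless
                break;
        }
    }

    synchronized State state() {
        return state;
    }

    synchronized int ignoredCount() {
        return ignored;
    }
}
```

If the cancel site sends KILLED_ON_REQUEST and the launch thread sends it as well, the container still ends up in DONE with one ignored event. If the launch thread never runs, the copy from the cancel site alone is enough to leave KILLING.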
[~vinodkv] what surprises me here is our single dispatcher thread model :( We can really see multiple issues if a client request arrives anywhere in the middle of a state transition and cancels part of the expected code path, destroying the expected state transitions.
Btw, interesting finding [~zjshen] :)
                
> TestNMClient.testNMClientNoCleanupOnStop fails occasionally
> -----------------------------------------------------------
>
>                 Key: YARN-906
>                 URL: https://issues.apache.org/jira/browse/YARN-906
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Zhijie Shen
>            Assignee: Zhijie Shen
>         Attachments: YARN-906.1.patch
>
>
> See https://builds.apache.org/job/PreCommit-YARN-Build/1435//testReport/org.apache.hadoop.yarn.client.api.impl/TestNMClient/testNMClientNoCleanupOnStop/
