hadoop-yarn-issues mailing list archives

From "Bikas Saha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-763) AMRMClientAsync should stop heartbeating after receiving shutdown from RM
Date Thu, 04 Jul 2013 23:25:48 GMT

    [ https://issues.apache.org/jira/browse/YARN-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700377#comment-13700377 ]

Bikas Saha commented on YARN-763:

By the time the callback thread handles the shutdown request, the heartbeat thread may have
already pinged the RM multiple times, and we should ideally avoid that: for each of those
pings the RM will end up sending back another resync/shutdown, or might fail it.
Ideally, the heartbeater thread itself should check the command and stop as needed, so that
there are no subsequent heartbeats.
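
To make the suggestion concrete, here is a minimal, self-contained sketch of a heartbeat loop that inspects the command on each response and stops on its own thread. It is not the AMRMClientAsync code: `Command`, `RmClient`, `Heartbeater` and `countHeartbeats` are hypothetical stand-ins introduced only for illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicInteger;

public class HeartbeatStopSketch {
    /** Hypothetical stand-in for the command carried on an allocate response. */
    enum Command { NONE, RESYNC, SHUTDOWN }

    /** Hypothetical stand-in for the client that talks to the RM. */
    interface RmClient {
        Command allocate();
    }

    /** Heartbeats until stop() is called or the RM answers with SHUTDOWN. */
    static final class Heartbeater implements Runnable {
        private final RmClient client;
        private final long intervalMs;
        private volatile boolean keepRunning = true;

        Heartbeater(RmClient client, long intervalMs) {
            this.client = client;
            this.intervalMs = intervalMs;
        }

        void stop() {
            keepRunning = false;
        }

        @Override
        public void run() {
            while (keepRunning) {
                Command command = client.allocate();
                if (command == Command.SHUTDOWN) {
                    // Stop here, on the heartbeat thread, instead of waiting
                    // for the callback thread to get around to the response.
                    keepRunning = false;
                    return;
                }
                try {
                    Thread.sleep(intervalMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    /** Runs the loop against a scripted client and returns the number of allocate calls. */
    static int countHeartbeats(List<Command> script) throws InterruptedException {
        Queue<Command> responses = new ArrayDeque<>(script);
        AtomicInteger calls = new AtomicInteger();
        RmClient client = () -> {
            calls.incrementAndGet();
            Command next = responses.poll();
            return next == null ? Command.NONE : next;
        };
        Heartbeater heartbeater = new Heartbeater(client, 1L);
        Thread thread = new Thread(heartbeater, "heartbeater");
        thread.start();
        thread.join(2000L);  // the loop should end by itself at SHUTDOWN
        heartbeater.stop();  // safety net for scripts that never send SHUTDOWN
        thread.join();
        return calls.get();
    }

    public static void main(String[] args) throws InterruptedException {
        int beats = countHeartbeats(
            Arrays.asList(Command.NONE, Command.SHUTDOWN, Command.NONE));
        System.out.println("allocate calls: " + beats); // prints "allocate calls: 2"
    }
}
```

The third scripted response is never consumed, because the loop exits as soon as it sees the shutdown command.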

It is not quite clear what the test is testing. The thing to be tested is that the heartbeater
thread makes no allocate call after it has been sent a shutdown command by the RM. I don't
see anything that verifies this behavior.
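
As a rough illustration of the kind of check meant here, the sketch below uses a hand-rolled recording fake instead of the patch's Mockito mock. `RecordingClient`, `stoppingLoop`, `ignoringLoop` and `callsAfterShutdown` are hypothetical names, and the two loops are toy stand-ins for a heartbeater that honors or ignores the command.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

public class NoAllocateAfterShutdownSketch {
    /** Hypothetical fake client: the first allocate() answers "shutdown". */
    static final class RecordingClient {
        private final AtomicBoolean shutdownSent = new AtomicBoolean();
        private final AtomicInteger callsAfterShutdown = new AtomicInteger();

        /** Returns true when the response carries the shutdown command. */
        boolean allocate() {
            if (shutdownSent.getAndSet(true)) {
                // Any call that arrives after shutdown is what the test must catch.
                callsAfterShutdown.incrementAndGet();
                return false;
            }
            return true;
        }

        int callsAfterShutdown() {
            return callsAfterShutdown.get();
        }
    }

    /** Toy loop that honors the command and exits on shutdown. */
    static void stoppingLoop(RecordingClient client) {
        for (int beat = 0; beat < 5; beat++) {
            if (client.allocate()) {
                return;
            }
        }
    }

    /** Toy loop that ignores the command and keeps heartbeating. */
    static void ignoringLoop(RecordingClient client) {
        for (int beat = 0; beat < 5; beat++) {
            client.allocate();
        }
    }

    /** Runs a loop on its own thread and reports allocate calls made after shutdown. */
    static int callsAfterShutdown(Consumer<RecordingClient> loop)
            throws InterruptedException {
        RecordingClient client = new RecordingClient();
        Thread thread = new Thread(() -> loop.accept(client), "heartbeater");
        thread.start();
        thread.join();
        return client.callsAfterShutdown();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(callsAfterShutdown(NoAllocateAfterShutdownSketch::stoppingLoop)); // 0
        System.out.println(callsAfterShutdown(NoAllocateAfterShutdownSketch::ignoringLoop)); // 4
    }
}
```

The assertion that matters is the count of calls after the shutdown response: it is zero for a loop that stops and non-zero for one that keeps going. With the Mockito mock already used in the patch, a `verify(client, times(1)).allocate(anyFloat())` would express a similar check.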

Secondly, there is a lot of probably unnecessary code in the test. I don't think the multiple
responses after shutdown or the mocking of client.getAvailableResources are required:
+    final AllocateResponse response1 = createAllocateResponse(
+        new ArrayList<ContainerStatus>(), allocated1, null);
+    final AllocateResponse response2 = createAllocateResponse(completed1,
+        new ArrayList<Container>(), null);
+    final AllocateResponse shutDownResponse = createAllocateResponse(
+        new ArrayList<ContainerStatus>(), new ArrayList<Container>(), null);
+    shutDownResponse.setAMCommand(AMCommand.AM_SHUTDOWN);
+    TestCallbackHandler callbackHandler = new TestCallbackHandler();
+    final AMRMClient<ContainerRequest> client = mock(AMRMClientImpl.class);
+    when(client.allocate(anyFloat())).thenReturn(shutDownResponse)
+        .thenReturn(response1).thenReturn(response2);
+    when(client.registerApplicationMaster(anyString(), anyInt(), anyString()))
+      .thenReturn(null);
+    when(client.getAvailableResources()).thenAnswer(new Answer<Resource>() {
+      @Override
+      public Resource answer(InvocationOnMock invocation)
+          throws Throwable {
+        // take client lock to simulate behavior of real impl
+        synchronized (client) {
+          Thread.sleep(10);
+        }
+        return null;
+      }
+    });

On a different note, serviceStop() should not call join() on the heartbeater thread. While
serviceStop() blocks on the join(), it may be holding on to application locks in its call tree.
The callback thread might be waiting on those locks as it upcalls to the app code, resulting
in a deadlock. However, we should ensure that the JVM is not hung because of any issue on this
thread, so we should mark the callback thread as a daemon so that the JVM exits even if that
thread is still running.
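
A minimal sketch of that pattern, using only the JDK: the callback thread is marked as a daemon before it starts, and the stop path interrupts it and returns without join(). `startCallbackThread` and `stop` are hypothetical helpers for illustration, not the AMRMClientAsync methods.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class DaemonCallbackThreadSketch {
    /** Starts a daemon thread that drains callbacks and upcalls into app code. */
    static Thread startCallbackThread(BlockingQueue<Runnable> callbacks) {
        Thread handler = new Thread(() -> {
            try {
                while (true) {
                    callbacks.take().run(); // upcall; may block on application locks
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // exit quietly when interrupted
            }
        }, "callback-handler");
        handler.setDaemon(true); // must be set before start()
        handler.start();
        return handler;
    }

    /** Stops without join(): the caller may hold locks the handler is waiting for. */
    static void stop(Thread handler) {
        handler.interrupt();
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Runnable> callbacks = new LinkedBlockingQueue<>();
        Thread handler = startCallbackThread(callbacks);
        callbacks.put(() -> System.out.println("callback delivered"));
        Thread.sleep(100L); // give the handler a moment to run the callback
        stop(handler);
        System.out.println("daemon: " + handler.isDaemon()); // prints "daemon: true"
        // main returns here; the JVM can exit even if the handler were still blocked.
    }
}
```

Because the handler is a daemon thread, a callback stuck in application code cannot keep the process alive after the non-daemon threads finish.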
> AMRMClientAsync should stop heartbeating after receiving shutdown from RM
> -------------------------------------------------------------------------
>                 Key: YARN-763
>                 URL: https://issues.apache.org/jira/browse/YARN-763
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bikas Saha
>            Assignee: Xuan Gong
>         Attachments: YARN-763.1.patch, YARN-763.2.patch

