hadoop-yarn-issues mailing list archives

From "Zhijie Shen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2209) Replace AM resync/shutdown command with corresponding exceptions
Date Tue, 29 Jul 2014 09:39:39 GMT

    [ https://issues.apache.org/jira/browse/YARN-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077554#comment-14077554
] 

Zhijie Shen commented on YARN-2209:
-----------------------------------

While the change breaks neither binary nor source compatibility, existing application logic
is still at risk of being broken by changing the way the AM is signaled, from AMCommand to
an exception. As mentioned above, applications other than MR will be affected by this change.
For example, suppose an application's AM logic looks as follows:

{code}
  try {
    response = ams.allocate(...);
  } catch (Exception e) {
    // any allocate failure is treated as fatal
    ams.finishApplicationMaster(...);
  }
  if (response is shutdown/resync) {
    // cleanup and reboot ...
  }
{code}

This logic is likely to break if the application runs on a YARN cluster with this patch
applied. The application does not expect shutdown/resync to be signaled via an exception;
it simply catches the failure of the allocate operation and terminates the application.
In this case, an application that would have been retried across an RM restart on a current
YARN cluster is likely to conclude with failure (assuming the signal to kill the AM container
arrives after all of the aforementioned logic has run).
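
To make the scenario concrete, here is a minimal, self-contained sketch that simulates both
signaling styles. Everything in it is an assumption for illustration: ResyncException, the
Scheduler interface, and the string results are hypothetical stand-ins, not the real YARN
classes or the exception types used by the patch.

```java
// Hypothetical stand-ins only; none of these names are the real YARN API.
class AmResyncSketch {
    /** Hypothetical exception a patched RM would throw to request a resync. */
    static class ResyncException extends Exception {}

    /** Hypothetical slice of the scheduler protocol; the String models the response command. */
    interface Scheduler {
        String allocate() throws Exception;
    }

    /** Legacy AM: treats any exception from allocate() as a fatal failure. */
    static String legacyAm(Scheduler ams) {
        String response;
        try {
            response = ams.allocate();
        } catch (Exception e) {
            return "FINISHED_AS_FAILED"; // stands in for finishApplicationMaster(...)
        }
        // Command-style signaling: the resync request arrives in the response.
        return "RESYNC".equals(response) ? "RESYNCED" : "RUNNING";
    }

    /** Updated AM: handles the resync exception before the generic failure path. */
    static String awareAm(Scheduler ams) {
        try {
            String response = ams.allocate();
            return "RESYNC".equals(response) ? "RESYNCED" : "RUNNING";
        } catch (ResyncException e) {
            return "RESYNCED"; // the specific catch clause must come first
        } catch (Exception e) {
            return "FINISHED_AS_FAILED";
        }
    }

    public static void main(String[] args) {
        Scheduler commandStyleRm = () -> "RESYNC";
        Scheduler exceptionStyleRm = () -> { throw new ResyncException(); };

        System.out.println("legacy AM, command-style RM:   " + legacyAm(commandStyleRm));
        System.out.println("legacy AM, exception-style RM: " + legacyAm(exceptionStyleRm));
        System.out.println("aware AM,  command-style RM:   " + awareAm(commandStyleRm));
        System.out.println("aware AM,  exception-style RM: " + awareAm(exceptionStyleRm));
    }
}
```

Only the second combination, an unmodified AM against an RM that signals via exception, ends
in failure; the other three reach the resync path. That is the case the patch would create
for applications that are not updated.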

In general, the problem is this: we previously declared that an API throws exception 1,
exception 2, and so on, and we expected users to handle those exceptions. To handle them
correctly, users need to know, implicitly or explicitly, in what situations each exception
is raised (in YARN, it seems users had to figure this out themselves, as we hardly wrote
any javadoc for the exceptions). Later, we do not change the API method signature. Instead,
we add or modify the situations in which an exception is raised, or (as in this case) throw
a sub-exception that was not expected before. Hence, existing API users are likely to break
on the newly added or modified exception, because they could not have taken it into
consideration. Should this be considered a kind of *logic incompatibility*?
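
One way to see the gap between the declared contract and the behavioral one is to compare
what an existing catch clause accepts with the exact exception class that is thrown. The
sketch below is only an illustration under assumed names: SignalException and the two
Callable "versions" of an API are hypothetical, and IOException merely stands in for
whatever exception type the method signature declares.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical illustration; not taken from the YARN code base.
class ExceptionContractSketch {
    /** Hypothetical sub-exception introduced in a later release. */
    static class SignalException extends IOException {}

    /** Returns the exact class of the exception thrown by the call, or null if none. */
    static Class<?> thrownBy(Callable<Void> call) {
        try {
            call.call();
            return null;
        } catch (Exception e) {
            return e.getClass();
        }
    }

    /** Declared view: would an existing catch (IOException e) clause accept it? */
    static boolean matchesDeclared(Class<?> thrown) {
        return thrown != null && IOException.class.isAssignableFrom(thrown);
    }

    /** Behavioral view: is it exactly the exception class callers were told about? */
    static boolean matchesDocumented(Class<?> thrown) {
        return IOException.class.equals(thrown);
    }

    public static void main(String[] args) {
        Callable<Void> v1 = () -> { throw new IOException("real failure"); };
        Callable<Void> v2 = () -> { throw new SignalException(); };

        for (Callable<Void> api : java.util.List.of(v1, v2)) {
            Class<?> thrown = thrownBy(api);
            System.out.println(thrown.getSimpleName()
                + " declared=" + matchesDeclared(thrown)
                + " documented=" + matchesDocumented(thrown));
        }
    }
}
```

Both versions satisfy the declared check, so old callers compile and link unchanged, but
only the first satisfies the exact-class check. The difference between the two checks is
the kind of incompatibility described above.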

> Replace AM resync/shutdown command with corresponding exceptions
> ----------------------------------------------------------------
>
>                 Key: YARN-2209
>                 URL: https://issues.apache.org/jira/browse/YARN-2209
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Jian He
>            Assignee: Jian He
>         Attachments: YARN-2209.1.patch, YARN-2209.2.patch, YARN-2209.3.patch, YARN-2209.4.patch,
YARN-2209.5.patch
>
>
> YARN-1365 introduced an ApplicationMasterNotRegisteredException to tell the application
to re-register on RM restart. We should do the same for the AMS#allocate call.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
