Date: Thu, 28 Feb 2013 18:47:13 +0000 (UTC)
From: "Sandy Ryza (JIRA)"
To: yarn-issues@hadoop.apache.org
Reply-To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-417) Add a poller that allows the AM to receive notifications when it is assigned containers

    [ https://issues.apache.org/jira/browse/YARN-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589785#comment-13589785 ]

Sandy Ryza commented on YARN-417:
---------------------------------

bq. I think if ContainerExitCodes needs to be added then it should be its own jira

Will move the container exit codes into a separate JIRA.

bq. The helper function would have helped because containers contain information set by 2 entities...

The issue is that there is not a ton of information for a helper function to interpret.
From what I can tell, the framework only defines two special exit codes, and does not distinguish between OOMs and other kinds of container failures, or between a container that was killed because it was preempted and one that was killed because the RM lost track of it. These exit codes are platform independent, whereas any other exit code can be both application and platform dependent, so the AMRMClientAsync wouldn't know how to interpret it. As ContainerStatuses coming from the RM are only in the context of container completions, ContainerState provides no extra information. Additional information can sometimes be found in the diagnostics strings, but if the reasons that containers die are to be codified, I don't think it should be done by interpreting strings at the API level.

bq. Why is client.start() being called in init? client.stop() is being called in stop().

registerApplicationMaster needs to be called after setting up the RM proxy, which occurs in AMRMClient#start, but before starting the heartbeater, which occurs in AMRMClientAsync#start. Another way to accomplish this would be to move the code in AMRMClientImpl#start to AMRMClientImpl#init, which also seems reasonable to me. A third way would be to call registerApplicationMaster from AMRMClientAsync#start.

bq. I am wary of calling back on the heartbeat thread itself.

Will add a handling thread.

bq. Not waiting for the thread to join()? Why interrupt()? Thread needs to be stopped first so that it stops calling into the client, or else it can call into a client that has already stopped.

Good point. My reason was that I've seen this as a convention in other places in YARN (see NodeStatusUpdaterImpl, for example), and that it would allow stop to be called from onContainerCompleted without deadlock. With the handling thread, the latter shouldn't be a problem, so I'll change it.
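For illustration only, here is a rough sketch of the lifecycle ordering discussed above: set up the proxy, register, then start heartbeating; deliver callbacks on a separate handling thread; and in stop, interrupt and join the heartbeat thread before stopping the underlying client. It is not the code in the attached patch. SyncClient, CompletionHandler, AsyncClientSketch, and onContainersCompleted are hypothetical stand-ins, and the fake client in runDemo exists only to record the order of calls.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the synchronous client; NOT the real AMRMClient API.
interface SyncClient {
    void start();              // sets up the RM proxy
    void register();           // stands in for registerApplicationMaster
    List<String> heartbeat();  // returns ids of containers that completed
    void stop();
}

// Hypothetical callback interface implemented by the AM.
interface CompletionHandler {
    void onContainersCompleted(List<String> containerIds);
}

final class AsyncClientSketch {
    private final SyncClient client;
    private final CompletionHandler handler;
    private final long intervalMs;
    private final BlockingQueue<List<String>> completions = new LinkedBlockingQueue<>();
    private volatile boolean running;
    private Thread heartbeater;
    private Thread dispatcher;

    AsyncClientSketch(SyncClient client, CompletionHandler handler, long intervalMs) {
        this.client = client;
        this.handler = handler;
        this.intervalMs = intervalMs;
    }

    void start() {
        client.start();     // proxy first
        client.register();  // register before the first heartbeat
        running = true;
        dispatcher = new Thread(this::dispatchLoop, "callback-dispatcher");
        heartbeater = new Thread(this::heartbeatLoop, "heartbeater");
        dispatcher.start();
        heartbeater.start();
    }

    void stop() {
        running = false;
        heartbeater.interrupt();
        join(heartbeater);  // after this, nothing calls into the client anymore
        if (Thread.currentThread() != dispatcher) {  // stop() may be called from a callback
            dispatcher.interrupt();
            join(dispatcher);
        }
        client.stop();
    }

    private void heartbeatLoop() {
        while (running) {
            List<String> completed = client.heartbeat();
            if (!completed.isEmpty()) {
                completions.add(completed);  // hand off; never call back on this thread
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                return;
            }
        }
    }

    private void dispatchLoop() {
        while (running) {
            try {
                handler.onContainersCompleted(completions.take());
            } catch (InterruptedException e) {
                return;
            }
        }
    }

    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Runs the sketch against a fake client and returns the recorded call order.
    static List<String> runDemo() {
        List<String> events = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch delivered = new CountDownLatch(1);
        AtomicBoolean first = new AtomicBoolean(true);
        SyncClient fake = new SyncClient() {
            public void start() { events.add("start"); }
            public void register() { events.add("register"); }
            public List<String> heartbeat() {
                events.add("heartbeat");
                return first.getAndSet(false)
                        ? Collections.singletonList("container_1")
                        : Collections.emptyList();
            }
            public void stop() { events.add("stop"); }
        };
        AsyncClientSketch async = new AsyncClientSketch(fake, ids -> {
            events.add("completed:" + ids.get(0));
            delivered.countDown();
        }, 10);
        async.start();
        try {
            delivered.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        async.stop();
        return new ArrayList<>(events);
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

In this sketch, stop skips interrupting and joining the dispatcher when it is invoked from a callback, since a thread joining itself would never return; the heartbeat thread is still joined first, so the client is never stopped while a heartbeat is in flight.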
> Add a poller that allows the AM to receive notifications when it is assigned containers
> ---------------------------------------------------------------------------------------
>
>                 Key: YARN-417
>                 URL: https://issues.apache.org/jira/browse/YARN-417
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, applications
>    Affects Versions: 2.0.3-alpha
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
>         Attachments: AMRMClientAsync-1.java, AMRMClientAsync.java, YARN-417-1.patch, YARN-417.patch, YarnAppMaster.java, YarnAppMasterListener.java
>
>
> Writing AMs would be easier for some if they did not have to handle heartbeating to the RM on their own.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira