Date: Mon, 14 Apr 2014 16:54:16 +0000 (UTC)
From: "Xuan Gong (JIRA)"
To: yarn-issues@hadoop.apache.org
Reply-To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-1897) Define SignalContainerRequest and SignalContainerResponse

    [ https://issues.apache.org/jira/browse/YARN-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968521#comment-13968521 ]

Xuan Gong commented on YARN-1897:
---------------------------------

[~mingma]

bq. For SignalContainerResponse, what is the semantics of isCMDCompleted?
If we want to support synchronous signal container call and this flag indicates whether ContainerExecutor has signaled on the container successfully, that will require RM to wait for the response from NM after NM finishes the work; it implies ApplicationClientProtocol's signalContainer method will hold up a RPC handler for some period of time; we can have some time out or rate limiting on signalContainer call to make sure applications won't be able to consume all RM's RPC handlers. If isCMDCompleted means if the command has been submitted to RM successfully, then it is ok; or we can use exception to indicate failure of the request.

OK. We should do our best to handle this asynchronously. We will rely on the node heartbeat to send the container command to the relevant NM. After the NM executes the command, it can report the result (whether the command finished successfully) back to the RM with the node heartbeat, too.

But this raises other questions. Because we cannot control how long the NM needs to execute the command and report back to the RM, we cannot tell the client exactly how long it should wait for the response. We also need to consider RM restart, RM failover, etc.

To make progress, I think that for now it is fine to check only whether the command has been submitted to the RM successfully (whether the container exists, whether it has already been killed, etc.). So, keep isCMDCompleted in SignalContainerResponse? What do you think?

> Define SignalContainerRequest and SignalContainerResponse
> ---------------------------------------------------------
>
>                 Key: YARN-1897
>                 URL: https://issues.apache.org/jira/browse/YARN-1897
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api
>            Reporter: Ming Ma
>
> We need to define SignalContainerRequest and SignalContainerResponse first as they are needed by other sub tasks. SignalContainerRequest should use OS-independent commands and provide a way to application to specify "reason" for diagnosis.
> SignalContainerResponse might be empty.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
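For illustration only, the "submitted to RM successfully" reading of isCMDCompleted discussed in the comment could be sketched roughly as below. This is a minimal sketch under stated assumptions, not the YARN implementation: the class SignalContainerSketch, its register and onNodeHeartbeat methods, the ContainerState enum, and the "THREAD_DUMP" command string are all hypothetical names invented here. The RM-side check only validates that the container exists and has not been killed, then queues the command for delivery on the next node heartbeat.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Hypothetical sketch; none of these types are the real YARN classes. */
public class SignalContainerSketch {

    /** Simplified container states, assumed for this example. */
    enum ContainerState { RUNNING, KILLED }

    /** Stand-in for SignalContainerResponse carrying the flag under discussion. */
    static final class Response {
        // Here the flag means "accepted by the RM", not "executed by the NM".
        final boolean isCMDCompleted;

        Response(boolean accepted) {
            this.isCMDCompleted = accepted;
        }
    }

    private final Map<String, ContainerState> containerStates = new HashMap<>();
    private final Map<String, String> containerToNode = new HashMap<>();
    private final Map<String, Queue<String>> pendingByNode = new HashMap<>();

    /** Records a container the (hypothetical) RM knows about. */
    void register(String containerId, String nodeId, ContainerState state) {
        containerStates.put(containerId, state);
        containerToNode.put(containerId, nodeId);
    }

    /**
     * Validates the request and queues it without waiting for the NM,
     * so no RPC handler is held while the signal is actually delivered.
     */
    Response signalContainer(String containerId, String command) {
        ContainerState state = containerStates.get(containerId);
        if (state != ContainerState.RUNNING) {
            // Unknown container or already killed: reject at submission time.
            return new Response(false);
        }
        String nodeId = containerToNode.get(containerId);
        pendingByNode
            .computeIfAbsent(nodeId, k -> new ArrayDeque<>())
            .add(containerId + ":" + command);
        return new Response(true);
    }

    /** Drains the commands queued for a node when that node heartbeats. */
    List<String> onNodeHeartbeat(String nodeId) {
        Queue<String> queued = pendingByNode.remove(nodeId);
        return queued == null ? new ArrayList<>() : new ArrayList<>(queued);
    }
}
```

In this sketch the client learns only whether the request was accepted. Reporting the NM's execution result back through a later heartbeat, and surviving RM restart or failover, are the open questions raised in the comment and are deliberately left out.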