From: Demon King
Date: Thu, 28 Sep 2017 11:47:05 +0800
Subject: how to set a rpc timeout on yarn application.
To: user@hadoop.apache.org
Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm

Hi,

We have finished a YARN application and deployed it to a Hadoop 2.6.0 cluster. However, if one machine in the cluster is down, our application hangs in NMClientAsyncImpl.stop(). The last log lines are:

953851 17/09/27 10:46:52 INFO impl.ContainerManagementProtocolProxy: Opening proxy : a14k17307.em21.tbsite.net:8041
953852 17/09/27 10:46:52 INFO impl.ContainerManagementProtocolProxy: Opening proxy : a24a21449.em21.tbsite.net:8041
953853 17/09/27 10:46:52 INFO impl.ContainerManagementProtocolProxy: Opening proxy : e92e09611.em21.tbsite.net:8041
953854 17/09/27 10:46:52 INFO impl.ContainerManagementProtocolProxy: Opening proxy : a24a21477.em21.tbsite.net:8041
953855 17/09/27 10:46:52 INFO impl.ContainerManagementProtocolProxy: Opening proxy : e92e09590.em21.tbsite.net:8041
953856 17/09/27 10:46:52 INFO impl.ContainerManagementProtocolProxy: Opening proxy : a14k17314.em21.tbsite.net:8041
953857 17/09/27 10:46:52 INFO impl.ContainerManagementProtocolProxy: Opening proxy : a24a21474.em21.tbsite.net:8041
953858 17/09/27 10:46:52 INFO impl.ContainerManagementProtocolProxy: Opening proxy : e92e09574.em21.tbsite.net:8041
953859 17/09/27 10:46:52 INFO impl.ContainerManagementProtocolProxy: Opening proxy : e92e09579.em21.tbsite.net:8041
953860 17/09/27 10:46:52 INFO impl.ContainerManagementProtocolProxy: Opening proxy : a14k17332.em21.tbsite.net:8041
953861 17/09/27 10:46:52 INFO impl.ContainerManagementProtocolProxy: Opening proxy : a14k17300.em21.tbsite.net:8041
953862 17/09/27 10:46:52 INFO impl.ContainerManagementProtocolProxy: Opening proxy : e93g17455.em21.tbsite.net:8041
953863 17/09/27 10:46:52 INFO impl.ContainerManagementProtocolProxy: Opening proxy : e92e09595.em21.tbsite.net:8041
953864 17/09/27 10:46:52 INFO impl.ContainerManagementProtocolProxy: Opening proxy : a14k17309.em21.tbsite.net:8041
953865 17/09/27 10:46:52 INFO impl.ContainerManagementProtocolProxy: Opening proxy : a14k17294.em21.tbsite.net:8041
953866 17/09/27 10:46:52 INFO impl.ContainerManagementProtocolProxy: Opening proxy : a14k17316.em21.tbsite.net:8041
953867 17/09/27 10:46:52 INFO impl.ContainerManagementProtocolProxy: Opening proxy : e92e09592.em21.tbsite.net:8041
953868 17/09/27 10:46:52 INFO impl.ContainerManagementProtocolProxy: Opening proxy : a24a21472.em21.tbsite.net:8041
953869 17/09/27 10:46:52 INFO impl.ContainerManagementProtocolProxy: Opening proxy : a14k17337.em21.tbsite.net:8041
953870 17/09/27 10:46:52 INFO impl.ContainerManagementProtocolProxy: Opening proxy : e92e09580.em21.tbsite.net:8041
953871 17/09/27 10:46:52 INFO impl.ContainerManagementProtocolProxy: Opening proxy : e92e09619.em21.tbsite.net:8041

Is there any way to set a timeout for NMClientAsyncImpl when calling stop()?

We initialize it with:

    NMClientAsync nmClient = NMClientAsync.createNMClientAsync(nmCallbackHandler);

and then:

    yarnConfig = new YarnConfiguration();
    nmClient.init(yarnConfig);
    nmClient.start();

Our code does not modify nmClient in any other way.
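
To make the question concrete: one generic way to bound a blocking call such as stop() is to run it on a separate daemon thread and wait for it with a deadline. The sketch below uses only java.util.concurrent and is not a YARN feature. The names BoundedStop, runWithTimeout and hangingStop are hypothetical, and the sleeping Runnable is only a stand-in for nmClient.stop() hanging on an unreachable node. In the real application the Runnable would presumably be nmClient::stop.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedStop {

    /**
     * Runs a blocking action on a daemon thread and waits at most timeoutMs.
     * Returns true if the action finished in time, false if it timed out.
     */
    public static boolean runWithTimeout(Runnable action, long timeoutMs)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-stop");
            t.setDaemon(true); // a stuck call must not keep the JVM alive
            return t;
        });
        Future<?> future = executor.submit(action);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker; the call may ignore it
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for nmClient.stop() blocking on a dead node.
        Runnable hangingStop = () -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        boolean finished = runWithTimeout(hangingStop, 200);
        System.out.println("stop finished in time: " + finished);
    }
}
```

This only bounds how long the caller waits. If the underlying call ignores interruption, its thread stays blocked in the background, which is why the worker is a daemon thread. Whether abandoning stop() this way is safe for the application's containers is an open assumption here.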