Date: Thu, 1 Nov 2012 05:43:32 +0000 (UTC)
From: "Vinod Kumar Vavilapalli (JIRA)"
To: yarn-issues@hadoop.apache.org
Message-ID: <1389600763.54642.1351748612905.JavaMail.jiratomcat@arcas>
Subject: [jira] [Moved] (YARN-196) Nodemanager if started before starting Resource manager is getting shutdown.But if both RM and NM are started and then after if RM is going down,NM is retrying for the RM.

     [ https://issues.apache.org/jira/browse/YARN-196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli moved MAPREDUCE-3676 to YARN-196:
---------------------------------------------------------

          Component/s:     (was: mrv2)
                       nodemanager
     Target Version/s:     (was: 0.23.0)
    Affects Version/s:     (was: 3.0.0)
                           (was: 2.0.0-alpha)
                           (was: 0.23.2)
                       3.0.0
                       2.0.0-alpha
                  Key: YARN-196  (was: MAPREDUCE-3676)
              Project: Hadoop YARN  (was: Hadoop Map/Reduce)

> Nodemanager if started before starting Resource manager is getting shutdown.But if both RM and NM are started and then after if RM is going down,NM is retrying for the RM.
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-196
>                 URL: https://issues.apache.org/jira/browse/YARN-196
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.0-alpha, 3.0.0
>            Reporter: Ramgopal N
>         Attachments: MAPREDUCE-3676.patch
>
>
> If the NM is started before the RM, the NM shuts down with the following error:
> {code}
> ERROR org.apache.hadoop.yarn.service.CompositeService: Error starting services org.apache.hadoop.yarn.server.nodemanager.NodeManager
> org.apache.avro.AvroRuntimeException: java.lang.reflect.UndeclaredThrowableException
>     at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:149)
>     at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:167)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:242)
> Caused by: java.lang.reflect.UndeclaredThrowableException
>     at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:66)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:182)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:145)
>     ... 3 more
> Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: Call From HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>     at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:131)
>     at $Proxy23.registerNodeManager(Unknown Source)
>     at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:59)
>     ... 5 more
> Caused by: java.net.ConnectException: Call From HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:857)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1141)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1100)
>     at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:128)
>     ... 7 more
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:659)
>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:469)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:563)
>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:211)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1247)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1117)
>     ... 9 more
> 2012-01-16 15:04:13,336 WARN org.apache.hadoop.yarn.event.AsyncDispatcher: AsyncDispatcher thread interrupted
> java.lang.InterruptedException
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1899)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1934)
>     at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:76)
>     at java.lang.Thread.run(Thread.java:619)
> 2012-01-16 15:04:13,337 INFO org.apache.hadoop.yarn.service.AbstractService: Service:Dispatcher is stopped.
> 2012-01-16 15:04:13,392 INFO org.mortbay.log: Stopped SelectChannelConnector@0.0.0.0:9999
> 2012-01-16 15:04:13,493 INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.nodemanager.webapp.WebServer is stopped.
> 2012-01-16 15:04:13,493 INFO org.apache.hadoop.ipc.Server: Stopping server on 24290
> 2012-01-16 15:04:13,494 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 24290
> 2012-01-16 15:04:13,495 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
> 2012-01-16 15:04:13,496 INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.NonAggregatingLogHandler is stopped.
> 2012-01-16 15:04:13,496 WARN org.apache.hadoop.yarn.event.AsyncDispatcher: AsyncDispatcher thread interrupted
> java.lang.InterruptedException
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1899)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1934)
>     at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:76)
>     at java.lang.Thread.run(Thread.java:619)
> {code}
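
For illustration only: the trace above shows the NM giving up after a single refused connection during registration, whereas the issue title says it retries when the RM goes down later. A minimal sketch of bounded retry around a registration call might look like the following. This is not the attached MAPREDUCE-3676.patch and not Hadoop code; the class `RegistrationRetry`, the method `retry`, and its parameters are hypothetical names. It also assumes the caller sees a plain `java.net.ConnectException`, while in the real trace that exception is wrapped several layers deep and would have to be unwrapped first.

```java
import java.net.ConnectException;
import java.util.concurrent.Callable;

/** Hypothetical helper, not part of Hadoop: retries a call that may hit "Connection refused". */
final class RegistrationRetry {
    private RegistrationRetry() {}

    /**
     * Runs {@code attempt} up to {@code maxAttempts} times, sleeping {@code sleepMillis}
     * between tries. Only ConnectException triggers a retry; any other exception
     * propagates immediately.
     */
    static <T> T retry(Callable<T> attempt, int maxAttempts, long sleepMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        ConnectException last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                return attempt.call();
            } catch (ConnectException e) {
                last = e;                      // remember the failure for the final rethrow
                if (i < maxAttempts) {
                    Thread.sleep(sleepMillis); // fixed delay; a real client might back off
                }
            }
        }
        throw last;                            // all attempts were refused
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Stand-in for a registration call: refused twice, then succeeds.
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new ConnectException("Connection refused");
            }
            return "registered";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Whether the NM should retry indefinitely, for a bounded time, or fail fast at startup is a design decision for the issue itself; the sketch only shows the bounded variant.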