Date: Thu, 8 Mar 2018 15:56:00 +0000 (UTC)
From: "Evan Tepsic (JIRA)"
To: yarn-dev@hadoop.apache.org
Subject: [jira] [Resolved] (YARN-8014) YARN ResourceManager Lists A NodeManager As RUNNING & SHUTDOWN Simultaneously

     [ https://issues.apache.org/jira/browse/YARN-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Evan Tepsic resolved YARN-8014.
-------------------------------
    Resolution: Fixed

> YARN ResourceManager Lists A NodeManager As RUNNING & SHUTDOWN Simultaneously
> ------------------------------------------------------------------------------
>
>                 Key: YARN-8014
>                 URL: https://issues.apache.org/jira/browse/YARN-8014
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.8.2
>            Reporter: Evan Tepsic
>            Priority: Minor
>
> A graceful shutdown and then startup of a NodeManager process using YARN/HDFS v2.8.2 seems to successfully place the Node back into the RUNNING state. However, the ResourceManager appears to also keep the Node in the SHUTDOWN state.
>
> *Steps To Reproduce:*
> 1. SSH to the host running the NodeManager.
> 2. Switch to the UserID that the NodeManager is running as (hadoop).
> 3. Execute cmd: /opt/hadoop/sbin/yarn-daemon.sh stop nodemanager
> 4. Wait for the NodeManager process to terminate gracefully.
> 5. Confirm the Node is in the SHUTDOWN state via: http://rb01rm01.local:8088/cluster/nodes
> 6. Execute cmd: /opt/hadoop/sbin/yarn-daemon.sh start nodemanager
> 7. Confirm the Node is in the RUNNING state via: http://rb01rm01.local:8088/cluster/nodes
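>
> A minimal shell sketch of the sequence above, for convenience (paths, hostnames, and the yarn CLI are taken from this report; the "start" in step 6 is assumed from the RUNNING check in step 7):
>
>   # On the NodeManager host (rb0101.local), as user hadoop:
>   /opt/hadoop/sbin/yarn-daemon.sh stop nodemanager
>   # Wait for the NodeManager JVM to exit, then check the node list from the ResourceManager host:
>   /opt/hadoop/bin/yarn node -list -all        # the node should now be reported as SHUTDOWN
>   # Start the NodeManager again and re-check:
>   /opt/hadoop/sbin/yarn-daemon.sh start nodemanager
>   /opt/hadoop/bin/yarn node -list -all        # the node re-registers as RUNNING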
>
> *Investigation:*
> 1. Review contents of ResourceManager + NodeManager log-files:
>
> +ResourceManager log-file:+
> 2018-03-08 08:15:44,085 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: Node with node id : rb0101.local:43892 has shutdown, hence unregistering the node.
> 2018-03-08 08:15:44,092 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Deactivating Node rb0101.local:43892 as it is now SHUTDOWN
> 2018-03-08 08:15:44,092 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: rb0101.local:43892 Node Transitioned from RUNNING to SHUTDOWN
> 2018-03-08 08:15:44,093 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Removed node rb0101.local:43892 cluster capacity:
> 2018-03-08 08:16:08,915 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: NodeManager from node rb0101.local(cmPort: 42627 httpPort: 8042) registered with capability: , assigned nodeId rb0101.local:42627
> 2018-03-08 08:16:08,916 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: rb0101.local:42627 Node Transitioned from NEW to RUNNING
> 2018-03-08 08:16:08,916 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Added node rb0101.local:42627 cluster capacity:
> 2018-03-08 08:16:34,826 WARN org.apache.hadoop.ipc.Server: Large response size 2976014 for call Call#428958 Retry#0 org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplications from 192.168.1.100:44034
>
> +NodeManager log-file:+
> 2018-03-08 08:00:14,500 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Cache Size Before Clean: 10720046250, Total Deleted: 0, Public Deleted: 0, Private Deleted: 0
> 2018-03-08 08:10:14,498 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Cache Size Before Clean: 10720046250, Total Deleted: 0, Public Deleted: 0, Private Deleted: 0
> 2018-03-08 08:15:44,048 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeManager: RECEIVED SIGNAL 15: SIGTERM
> 2018-03-08 08:15:44,101 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Successfully Unregistered the Node rb0101.local:43892 with ResourceManager.
> 2018-03-08 08:15:44,114 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:8042
> 2018-03-08 08:15:44,226 INFO org.apache.hadoop.ipc.Server: Stopping server on 43892
> 2018-03-08 08:15:44,232 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 43892
> 2018-03-08 08:15:44,237 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
> 2018-03-08 08:15:44,239 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService: org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService waiting for pending aggregation during exit
> 2018-03-08 08:15:44,242 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl is interrupted. Exiting.
> 2018-03-08 08:15:44,284 INFO org.apache.hadoop.ipc.Server: Stopping server on 8040
> 2018-03-08 08:15:44,285 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8040
> 2018-03-08 08:15:44,285 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
> 2018-03-08 08:15:44,287 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Public cache exiting
> 2018-03-08 08:15:44,289 WARN org.apache.hadoop.yarn.server.nodemanager.NodeResourceMonitorImpl: org.apache.hadoop.yarn.server.nodemanager.NodeResourceMonitorImpl is interrupted. Exiting.
> 2018-03-08 08:15:44,294 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NodeManager metrics system...
> 2018-03-08 08:15:44,295 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system stopped.
> 2018-03-08 08:15:44,296 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system shutdown complete.
> 2018-03-08 08:15:44,297 INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NodeManager at rb0101.local/192.168.1.101
> ************************************************************/
> 2018-03-08 08:16:01,905 INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager: STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting NodeManager
> STARTUP_MSG:   user = hadoop
> STARTUP_MSG:   host = rb0101.local/192.168.1.101
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 2.8.2
> STARTUP_MSG:   classpath = blahblahblah (truncated for size-purposes)
> STARTUP_MSG:   build = Unknown -r Unknown; compiled by 'root' on 2017-09-14T18:22Z
> STARTUP_MSG:   java = 1.8.0_144
> ************************************************************/
> 2018-03-08 08:16:01,918 INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager: registered UNIX signal handlers for [TERM, HUP, INT]
> 2018-03-08 08:16:03,202 INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager: Node Manager health check script is not available or doesn't have execute permission, so not starting the node health script runner.
> 2018-03-08 08:16:03,321 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher
> 2018-03-08 08:16:03,322 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher
> 2018-03-08 08:16:03,323 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.LocalizationEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService
> 2018-03-08 08:16:03,323 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServicesEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices
> 2018-03-08 08:16:03,324 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl
> 2018-03-08 08:16:03,324 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncherEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher
> 2018-03-08 08:16:03,347 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.ContainerManagerEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl
> 2018-03-08 08:16:03,348 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.NodeManagerEventType for class org.apache.hadoop.yarn.server.nodemanager.NodeManager
> 2018-03-08 08:16:03,402 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
> 2018-03-08 08:16:03,484 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
> 2018-03-08 08:16:03,484 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system started
> 2018-03-08 08:16:03,561 INFO org.apache.hadoop.yarn.server.nodemanager.NodeResourceMonitorImpl: Using ResourceCalculatorPlugin : org.apache.hadoop.yarn.util.ResourceCalculatorPlugin@4b8729ff
> 2018-03-08 08:16:03,564 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.event.LogHandlerEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService
> 2018-03-08 08:16:03,565 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.sharedcache.SharedCacheUploadEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.sharedcache.SharedCacheUploadService
> 2018-03-08 08:16:03,565 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: AMRMProxyService is disabled
> 2018-03-08 08:16:03,566 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: per directory file limit = 8192
> 2018-03-08 08:16:03,621 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: usercache path : file:/space/hadoop/tmp/nm-local-dir/usercache_DEL_1520518563569
> 2018-03-08 08:16:03,667 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : file:/space/hadoop/tmp/nm-local-dir/usercache_DEL_1520518563569/user1
> 2018-03-08 08:16:03,667 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : file:/space/hadoop/tmp/nm-local-dir/usercache_DEL_1520518563569/user2
> 2018-03-08 08:16:03,668 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : file:/space/hadoop/tmp/nm-local-dir/usercache_DEL_1520518563569/user3
> 2018-03-08 08:16:03,681 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : file:/space/hadoop/tmp/nm-local-dir/usercache_DEL_1520518563569/user4
> 2018-03-08 08:16:03,739 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.LocalizerEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker
> 2018-03-08 08:16:03,793 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Adding auxiliary service mapreduce_shuffle, "mapreduce_shuffle"
> 2018-03-08 08:16:03,826 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Using ResourceCalculatorPlugin : org.apache.hadoop.yarn.util.ResourceCalculatorPlugin@1187c9e8
> 2018-03-08 08:16:03,826 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Using ResourceCalculatorProcessTree : null
> 2018-03-08 08:16:03,827 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Physical memory check enabled: true
> 2018-03-08 08:16:03,827 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Virtual memory check enabled: true
> 2018-03-08 08:16:03,832 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: ContainersMonitor enabled: true
> 2018-03-08 08:16:03,841 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Nodemanager resources: memory set to 12288MB.
> 2018-03-08 08:16:03,841 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Nodemanager resources: vcores set to 6.
> 2018-03-08 08:16:03,846 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Initialized nodemanager with : physical-memory=12288 virtual-memory=25805 virtual-cores=6
> 2018-03-08 08:16:03,850 INFO org.apache.hadoop.util.JvmPauseMonitor: Starting JVM pause monitor
> 2018-03-08 08:16:03,908 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue: class java.util.concurrent.LinkedBlockingQueue queueCapacity: 2000 scheduler: class org.apache.hadoop.ipc.DefaultRpcScheduler
> 2018-03-08 08:16:03,932 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 42627
> 2018-03-08 08:16:04,153 INFO org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.yarn.api.ContainerManagementProtocolPB to the server
> 2018-03-08 08:16:04,153 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Blocking new container-requests as container manager rpc server is still starting.
> 2018-03-08 08:16:04,154 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
> 2018-03-08 08:16:04,154 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 42627: starting
> 2018-03-08 08:16:04,166 INFO org.apache.hadoop.yarn.server.nodemanager.security.NMContainerTokenSecretManager: Updating node address : rb0101.local:42627
> 2018-03-08 08:16:04,183 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue: class java.util.concurrent.LinkedBlockingQueue queueCapacity: 500 scheduler: class org.apache.hadoop.ipc.DefaultRpcScheduler
> 2018-03-08 08:16:04,184 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8040
> 2018-03-08 08:16:04,191 INFO org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.yarn.server.nodemanager.api.LocalizationProtocolPB to the server
> 2018-03-08 08:16:04,191 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
> 2018-03-08 08:16:04,191 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8040: starting
> 2018-03-08 08:16:04,192 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Localizer started on port 8040
> 2018-03-08 08:16:04,312 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
> 2018-03-08 08:16:04,330 INFO org.apache.hadoop.mapred.ShuffleHandler: mapreduce_shuffle listening on port 13562
> 2018-03-08 08:16:04,337 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: ContainerManager started at rb0101.local/192.168.1.101:42627
> 2018-03-08 08:16:04,337 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: ContainerManager bound to 0.0.0.0/0.0.0.0:0
> 2018-03-08 08:16:04,340 INFO org.apache.hadoop.yarn.server.nodemanager.webapp.WebServer: Instantiating NMWebApp at 0.0.0.0:8042
> 2018-03-08 08:16:04,427 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
> 2018-03-08 08:16:04,436 INFO org.apache.hadoop.security.authentication.server.AuthenticationFilter: Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
> 2018-03-08 08:16:04,442 INFO org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.nodemanager is not defined
> 2018-03-08 08:16:04,450 INFO org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
> 2018-03-08 08:16:04,461 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context node
> 2018-03-08 08:16:04,462 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
> 2018-03-08 08:16:04,462 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
> 2018-03-08 08:16:04,462 INFO org.apache.hadoop.security.HttpCrossOriginFilterInitializer: CORS filter not enabled. Please set hadoop.http.cross-origin.enabled to 'true' to enable it
> 2018-03-08 08:16:04,465 INFO org.apache.hadoop.http.HttpServer2: adding path spec: /node/*
> 2018-03-08 08:16:04,465 INFO org.apache.hadoop.http.HttpServer2: adding path spec: /ws/*
> 2018-03-08 08:16:04,843 INFO org.apache.hadoop.yarn.webapp.WebApps: Registered webapp guice modules
> 2018-03-08 08:16:04,846 INFO org.apache.hadoop.http.HttpServer2: Jetty bound to port 8042
> 2018-03-08 08:16:04,846 INFO org.mortbay.log: jetty-6.1.26
> 2018-03-08 08:16:04,877 INFO org.mortbay.log: Extract jar:file:/opt/hadoop-2.8.2/share/hadoop/yarn/hadoop-yarn-common-2.8.2.jar!/webapps/node to /tmp/Jetty_0_0_0_0_8042_node____19tj0x/webapp
> 2018-03-08 08:16:08,355 INFO org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:8042
> 2018-03-08 08:16:08,356 INFO org.apache.hadoop.yarn.webapp.WebApps: Web app node started at 8042
> 2018-03-08 08:16:08,473 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Node ID assigned is : rb0101.local:42627
> 2018-03-08 08:16:08,498 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at rb01rm01.local/192.168.1.100:8031
> 2018-03-08 08:16:08,613 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending out 0 NM container statuses: []
> 2018-03-08 08:16:08,621 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Registering with RM using containers :[]
> 2018-03-08 08:16:08,934 INFO org.apache.hadoop.yarn.server.nodemanager.security.NMContainerTokenSecretManager: Rolling master-key for container-tokens, got key with id -2086472604
> 2018-03-08 08:16:08,938 INFO org.apache.hadoop.yarn.server.nodemanager.security.NMTokenSecretManagerInNM: Rolling master-key for container-tokens, got key with id -426187560
> 2018-03-08 08:16:08,939 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Registered with ResourceManager as rb0101.local:42627 with total resource of
> 2018-03-08 08:16:08,939 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Notifying ContainerManager to unblock new container-requests
> 2018-03-08 08:26:04,174 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Cache Size Before Clean: 0, Total Deleted: 0, Public Deleted: 0, Private Deleted: 0
> 2018-03-08 08:36:04,170 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Cache Size Before Clean: 0, Total Deleted: 0, Public Deleted: 0, Private Deleted: 0
> 2018-03-08 08:46:04,170 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Cache Size Before Clean: 0, Total Deleted: 0, Public Deleted: 0, Private Deleted: 0
>
> 2. Listing all of YARN's Nodes shows that the Node has returned to the RUNNING state. However, the same listing also still shows the node in 2 states, RUNNING and SHUTDOWN:
> [hadoop@rb01rm01 logs]$ /opt/hadoop/bin/yarn node -list -all
> 18/03/08 09:20:33 INFO client.RMProxy: Connecting to ResourceManager at rb01rm01.local/192.168.1.100:8032
> 18/03/08 09:20:34 INFO client.AHSProxy: Connecting to Application History server at rb01rm01.local/192.168.1.100:10200
> Total Nodes:11
> Node-Id              Node-State   Node-Http-Address    Number-of-Running-Containers
> rb0106.local:44160   RUNNING      rb0106.local:8042    0
> rb0105.local:32832   RUNNING      rb0105.local:8042    0
> rb0101.local:42627   RUNNING      rb0101.local:8042    0
> rb0108.local:38209   RUNNING      rb0108.local:8042    0
> rb0107.local:34306   RUNNING      rb0107.local:8042    0
> rb0102.local:43063   RUNNING      rb0102.local:8042    0
> rb0103.local:42374   RUNNING      rb0103.local:8042    0
> rb0109.local:37455   RUNNING      rb0109.local:8042    0
> rb0110.local:36690   RUNNING      rb0110.local:8042    0
> rb0104.local:33268   RUNNING      rb0104.local:8042    0
> rb0101.local:43892   SHUTDOWN     rb0101.local:8042    0
> [hadoop@rb01rm01 logs]$
> [hadoop@rb01rm01 logs]$ /opt/hadoop/bin/yarn node -list -states RUNNING
> 18/03/08 09:20:55 INFO client.RMProxy: Connecting to ResourceManager at rb01rm01.local/192.168.1.100:8032
> 18/03/08 09:20:56 INFO client.AHSProxy: Connecting to Application History server at rb01rm01.local/192.168.1.100:10200
> Total Nodes:10
> Node-Id              Node-State   Node-Http-Address    Number-of-Running-Containers
> rb0106.local:44160   RUNNING      rb0106.local:8042    0
> rb0105.local:32832   RUNNING      rb0105.local:8042    0
> rb0101.local:42627   RUNNING      rb0101.local:8042    0
> rb0108.local:38209   RUNNING      rb0108.local:8042    0
> rb0107.local:34306   RUNNING      rb0107.local:8042    0
> rb0102.local:43063   RUNNING      rb0102.local:8042    0
> rb0103.local:42374   RUNNING      rb0103.local:8042    0
> rb0109.local:37455   RUNNING      rb0109.local:8042    0
> rb0110.local:36690   RUNNING      rb0110.local:8042    0
> rb0104.local:33268   RUNNING      rb0104.local:8042    0
> [hadoop@rb01rm01 logs]$ /opt/hadoop/bin/yarn node -list -states SHUTDOWN
> 18/03/08 09:21:01 INFO client.RMProxy: Connecting to ResourceManager at rb01rm01.local/192.168.1.100:8032
> 18/03/08 09:21:01 INFO client.AHSProxy: Connecting to Application History server at rb01rm01.local/192.168.1.100:10200
> Total Nodes:0
> Node-Id              Node-State   Node-Http-Address    Number-of-Running-Containers
> [hadoop@rb01rm01 logs]$
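>
> As a cross-check that does not go through the yarn CLI, the ResourceManager's REST API can be queried for the same node information (a sketch; the /ws/v1/cluster/nodes endpoint and its optional states filter are part of the standard RM REST API, and the 8088 web port is taken from the cluster URLs above):
>
>   # all nodes currently tracked by the ResourceManager
>   curl 'http://rb01rm01.local:8088/ws/v1/cluster/nodes'
>   # only nodes the ResourceManager reports as SHUTDOWN
>   curl 'http://rb01rm01.local:8088/ws/v1/cluster/nodes?states=SHUTDOWN'
>
> Comparing this output with the CLI listings above should show whether the stale rb0101.local:43892 entry is still tracked, and under which state.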
> 3. The ResourceManager, however, does not list Node rb0101.local as SHUTDOWN when specifically requesting the list of Nodes in the SHUTDOWN state:
> [hadoop@rb01rm01 bin]$ /opt/hadoop/bin/yarn node -list -states SHUTDOWN
> 18/03/08 08:28:23 INFO client.RMProxy: Connecting to ResourceManager at rb01rm01.local/v.x.y.z:8032
> 18/03/08 08:28:24 INFO client.AHSProxy: Connecting to Application History server at rb01rm01.local/v.x.y.z:10200
> Total Nodes:0
> Node-Id              Node-State   Node-Http-Address    Number-of-Running-Containers
> [hadoop@rb01rm01 bin]$



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)