Date: Tue, 8 Dec 2015 06:35:11 +0000 (UTC)
From: "Jian He (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Comment Edited] (YARN-4424) YARN CLI command hangs

[ https://issues.apache.org/jira/browse/YARN-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15046453#comment-15046453 ]

Jian He edited comment on YARN-4424 at 12/8/15 6:34 AM:
--------------------------------------------------------

Sorry, I pasted the wrong thread stack. The thread below is holding RMAppImpl's read lock and trying to access RMAppAttemptImpl. Thanks for checking so carefully!
{code}
Thread 53732: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
 - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=186 (Interpreted frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() @bci=1, line=834 (Interpreted frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(int) @bci=83, line=964 (Interpreted frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(int) @bci=10, line=1282 (Interpreted frame)
 - java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock() @bci=5, line=731 (Interpreted frame)
 - org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.getTrackingUrl() @bci=4, line=511 (Interpreted frame)
 - org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.createAndGetApplicationReport(java.lang.String, boolean) @bci=82, line=618 (Interpreted frame)
 - org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(org.apache.hadoop.yarn.api.protocolrecords.GetApplicationReportRequest) @bci=116, line=334 (Interpreted frame)
 - org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(com.google.protobuf.RpcController, org.apache.hadoop.yarn.proto.YarnServiceProtos$GetApplicationReportRequestProto) @bci=14, line=175 (Interpreted frame)
 - org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(com.google.protobuf.Descriptors$MethodDescriptor, com.google.protobuf.RpcController, com.google.protobuf.Message) @bci=156, line=417 (Interpreted frame)
 - org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(org.apache.hadoop.ipc.RPC$Server, java.lang.String, org.apache.hadoop.io.Writable, long) @bci=246, line=616 (Interpreted frame)
 - org.apache.hadoop.ipc.RPC$Server.call(org.apache.hadoop.ipc.RPC$RpcKind, java.lang.String, org.apache.hadoop.io.Writable, long) @bci=9, line=969 (Interpreted frame)
 - org.apache.hadoop.ipc.Server$Handler$1.run() @bci=38, line=2151 (Interpreted frame)
 - org.apache.hadoop.ipc.Server$Handler$1.run() @bci=1, line=2147 (Interpreted frame)
 - java.security.AccessController.doPrivileged(java.security.PrivilegedExceptionAction, java.security.AccessControlContext) @bci=0 (Interpreted frame)
 - javax.security.auth.Subject.doAs(javax.security.auth.Subject, java.security.PrivilegedExceptionAction) @bci=42, line=415 (Interpreted frame)
 - org.apache.hadoop.security.UserGroupInformation.doAs(java.security.PrivilegedExceptionAction) @bci=14, line=1657 (Interpreted frame)
 - org.apache.hadoop.ipc.Server$Handler.run() @bci=315, line=2145 (Interpreted frame)
{code}

> YARN CLI command hangs
> ----------------------
>
> Key: YARN-4424
> URL: https://issues.apache.org/jira/browse/YARN-4424
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Yesha Vora
> Assignee: Jian He
> Priority: Blocker
> Attachments: YARN-4424.1.patch
>
>
> {code}
> yarn@XXX:/mnt/hadoopqe$ /usr/hdp/current/hadoop-yarn-client/bin/yarn application -list -appStates NEW,NEW_SAVING,SUBMITTED,ACCEPTED,RUNNING
> 15/12/04 21:59:54 INFO impl.TimelineClientImpl: Timeline service address: http://XXX:8188/ws/v1/timeline/
> 15/12/04 21:59:54 INFO client.RMProxy: Connecting to ResourceManager at XXX/0.0.0.0:8050
> 15/12/04 21:59:55 INFO client.AHSProxy: Connecting to Application History server at XXX/0.0.0.0:10200
> {code}
> {code:title=RM log}
> 2015-12-04 21:59:19,744 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 237000
> 2015-12-04 22:00:50,945 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 238000
> 2015-12-04 22:02:22,416 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 239000
> 2015-12-04 22:03:53,593 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 240000
> 2015-12-04 22:05:24,856 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 241000
> 2015-12-04 22:06:56,235 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 242000
> 2015-12-04 22:08:27,510 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 243000
> 2015-12-04 22:09:58,786 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 244000
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
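
For readers following the thread dump, here is a minimal sketch of the locking shape it shows: a reader that already holds one read lock (as RMAppImpl.createAndGetApplicationReport does) then waits on a second read lock (as in RMAppAttemptImpl.getTrackingUrl). This is not the YARN code. The names APP_LOCK, ATTEMPT_LOCK and attemptReadable are hypothetical, and the assumption that another thread holds the attempt's write lock is mine; the message does not say which thread that would be. The sketch uses tryLock with a timeout so that it terminates, whereas the frame in the dump calls lock() and parks indefinitely.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class NestedReadLockSketch {
    // Hypothetical stand-ins for the locks guarding RMAppImpl and RMAppAttemptImpl.
    static final ReentrantReadWriteLock APP_LOCK = new ReentrantReadWriteLock();
    static final ReentrantReadWriteLock ATTEMPT_LOCK = new ReentrantReadWriteLock();

    /**
     * Takes the app read lock, then tries the attempt read lock for up to timeoutMs.
     * If writerActive is true, another thread holds the attempt write lock meanwhile
     * (an assumption made for illustration only).
     */
    static boolean attemptReadable(boolean writerActive, long timeoutMs) {
        CountDownLatch writerHoldsLock = new CountDownLatch(1);
        CountDownLatch readerDone = new CountDownLatch(1);
        Thread writer = null;
        if (writerActive) {
            writer = new Thread(() -> {
                ATTEMPT_LOCK.writeLock().lock();
                try {
                    writerHoldsLock.countDown();
                    readerDone.await(); // keep the write lock until the reader has finished
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    ATTEMPT_LOCK.writeLock().unlock();
                }
            });
            writer.start();
        }
        APP_LOCK.readLock().lock(); // outer read lock, held for the whole report
        try {
            if (writerActive) {
                writerHoldsLock.await();
            }
            // Inner read lock: cannot be granted while another thread holds the write lock.
            boolean acquired = ATTEMPT_LOCK.readLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS);
            if (acquired) {
                ATTEMPT_LOCK.readLock().unlock();
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            APP_LOCK.readLock().unlock();
            readerDone.countDown();
            if (writer != null) {
                try {
                    writer.join(); // ensure the write lock is released before returning
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("writer active: " + attemptReadable(true, 200));  // prints false
        System.out.println("no writer:     " + attemptReadable(false, 200)); // prints true
    }
}
```

While the inner acquisition waits, the outer read lock stays held, so anything needing the outer write lock would be stalled behind it as well.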