Date: Tue, 1 Nov 2016 20:04:58 +0000 (UTC)
From: "Jian He (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-5762) Summarize ApplicationNotFoundException in the RM log

[ https://issues.apache.org/jira/browse/YARN-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15626511#comment-15626511 ]

Jian He commented on YARN-5762:
-------------------------------

[~raviprak], instead of ignoring the exception at the server, I think we should fix the caller (AggregatedLogDeletionService) to
not make a call for each application to check its state; this is also an expensive loop for the RM. Instead, it can make one call to get all application reports and check each app's state. What is your opinion? cc [~xgong]

> Summarize ApplicationNotFoundException in the RM log
> ----------------------------------------------------
>
>                 Key: YARN-5762
>                 URL: https://issues.apache.org/jira/browse/YARN-5762
>             Project: Hadoop YARN
>          Issue Type: Task
>    Affects Versions: 2.7.2
>            Reporter: Ravi Prakash
>            Assignee: Ravi Prakash
>            Priority: Minor
>         Attachments: YARN-5762.01.patch
>
>
> We found a lot of {{ApplicationNotFoundException}} in the RM logs. These were most likely caused by the {{AggregatedLogDeletionService}} [which checks|https://github.com/apache/hadoop/blob/262827cf75bf9c48cd95335eb04fd8ff1d64c538/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogDeletionService.java#L156] that the application is not running anymore, e.g.
> {code}2016-10-17 15:25:26,542 INFO org.apache.hadoop.ipc.Server: IPC Server handler 20 on 8032, call org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport from :12205 Call#35401 Retry#0
> org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1473396553140_1451' doesn't exist in RM.
> 	at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:327)
> 	at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175)
> 	at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
> 2016-10-17 15:25:26,633 INFO org.apache.hadoop.ipc.Server: IPC Server handler 47 on 8032, call org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport from :12205 Call#35404 Retry#0
> org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1473396553140_1452' doesn't exist in RM.
> 	at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:327)
> 	at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175)
> 	at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org
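
The suggestion in the comment above (one bulk call to the RM instead of one getApplicationReport call per application) could look roughly like the sketch below. It is only an illustration and does not use Hadoop classes: BatchedStateCheck, AppState, ResourceManagerView, fetchAllStates and deletableApps are all hypothetical names invented here, and treating an application the RM does not know about as safe to clean up is an assumption, not something the thread confirms.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in types for illustration; none of these are Hadoop classes.
final class BatchedStateCheck {

    /** Simplified application states; the real YARN enum has more values. */
    enum AppState { ACCEPTED, RUNNING, FINISHED, FAILED, KILLED }

    /** Hypothetical view of the RM: one call returns the state of every known application. */
    interface ResourceManagerView {
        Map<String, AppState> fetchAllStates();
    }

    private static final Set<AppState> TERMINAL =
            EnumSet.of(AppState.FINISHED, AppState.FAILED, AppState.KILLED);

    /**
     * Returns the candidate ids that are either unknown to the RM or in a terminal state.
     * Assumption: an unknown application is treated as no longer running.
     */
    static List<String> deletableApps(ResourceManagerView rm, List<String> candidateIds) {
        // Single round trip to the RM, regardless of how many candidates there are.
        Map<String, AppState> states = rm.fetchAllStates();
        List<String> deletable = new ArrayList<>();
        for (String id : candidateIds) {
            AppState state = states.get(id);
            // A missing entry is handled locally, so the RM never has to throw
            // (and log) a not-found exception for it.
            if (state == null || TERMINAL.contains(state)) {
                deletable.add(id);
            }
        }
        return deletable;
    }
}
```

With this shape the lookup for an application that has aged out of the RM happens in the caller's map, so the RM would see one request per deletion pass rather than one per application. Whether a bulk listing is cheaper in practice depends on how many applications the RM retains, which this sketch does not measure.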