Date: Thu, 25 Feb 2016 02:12:18 +0000 (UTC)
From: "Ming Ma (JIRA)"
To: yarn-issues@hadoop.apache.org
Reply-To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-4720) Skip unnecessary NN operations in log aggregation

    [ https://issues.apache.org/jira/browse/YARN-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15166575#comment-15166575 ]

Ming Ma commented on YARN-4720:
-------------------------------

It seems that {{LogAggregationStatus.RUNNING}} implies that the log aggregation service is running; it doesn't necessarily mean the NM has actually aggregated any logs.
So if the long-running service is running and hasn't generated any logs since it started, it is better to return {{LogAggregationStatus.RUNNING}}.

Yes, the NM can send several {{LogAggregationReport}}s in the list, which is ordered; that is the API between NM and RM. The RM side then retrieves all elements from the list.

> Skip unnecessary NN operations in log aggregation
> -------------------------------------------------
>
>                 Key: YARN-4720
>                 URL: https://issues.apache.org/jira/browse/YARN-4720
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Ming Ma
>            Assignee: Jun Gong
>         Attachments: YARN-4720.01.patch, YARN-4720.02.patch
>
>
> Log aggregation service could have unnecessary NN operations in the following scenarios:
> * No new local log has been created since the last upload for the long running service scenario.
> * NM uses {{ContainerLogAggregationPolicy}} that skips log aggregation for certain containers.
> In the following code snippet, even though {{pendingContainerInThisCycle}} is empty, it still creates the writer and then removes the file later. Thus it introduces unnecessary create/getfileinfo/delete NN calls when NM doesn't aggregate logs for an app.
>
> {noformat}
> AppLogAggregatorImpl.java
> ......
>     writer =
>         new LogWriter(this.conf, this.remoteNodeTmpLogFileForApp,
>             this.userUgi);
> ......
>       for (ContainerId container : pendingContainerInThisCycle) {
> ......
>       }
> ......
>     if (remoteFS.exists(remoteNodeTmpLogFileForApp)) {
>       if (rename) {
>         remoteFS.rename(remoteNodeTmpLogFileForApp, renamedPath);
>       } else {
>         remoteFS.delete(remoteNodeTmpLogFileForApp, false);
>       }
>     }
> ......
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
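
As an illustration of the empty-cycle case described in the issue, one possible approach is to return before the writer is created when there is nothing pending. The sketch below is not taken from the attached patches and does not use the Hadoop classes: {{LazyLogWriterSketch}}, {{CountingRemoteFs}}, and {{uploadCycle}} are hypothetical names, and the fake file system only records which NN-style calls would have been issued.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LazyLogWriterSketch {

    /** Hypothetical stand-in for the remote FS; it only records the calls made. */
    public static class CountingRemoteFs {
        public final List<String> calls = new ArrayList<>();
        private boolean tmpFileExists = false;

        void create(String path) { calls.add("create"); tmpFileExists = true; }
        boolean exists(String path) { calls.add("getfileinfo"); return tmpFileExists; }
        void rename(String src, String dst) { calls.add("rename"); tmpFileExists = false; }
        void delete(String path) { calls.add("delete"); tmpFileExists = false; }
    }

    /**
     * One upload cycle (assumed shape, simplified from the snippet in the issue).
     * Returns the number of containers whose logs were written.
     */
    public static int uploadCycle(CountingRemoteFs fs,
                                  List<String> pendingContainerInThisCycle,
                                  boolean rename) {
        // Assumption: with nothing pending, the cycle can stop here, so no
        // create/getfileinfo/delete calls are issued at all.
        if (pendingContainerInThisCycle.isEmpty()) {
            return 0;
        }

        fs.create("tmpLogFile"); // corresponds to constructing the writer
        int uploaded = 0;
        for (String container : pendingContainerInThisCycle) {
            uploaded++; // a real implementation would append this container's logs
        }

        if (fs.exists("tmpLogFile")) {
            if (rename) {
                fs.rename("tmpLogFile", "finalLogFile");
            } else {
                fs.delete("tmpLogFile");
            }
        }
        return uploaded;
    }

    public static void main(String[] args) {
        CountingRemoteFs idle = new CountingRemoteFs();
        uploadCycle(idle, new ArrayList<String>(), true);
        System.out.println(idle.calls); // prints []

        CountingRemoteFs busy = new CountingRemoteFs();
        uploadCycle(busy, Arrays.asList("container_1", "container_2"), true);
        System.out.println(busy.calls); // prints [create, getfileinfo, rename]
    }
}
```

Whether an early return is safe in the real {{AppLogAggregatorImpl}} depends on what else the cycle does (for example status reporting), which this sketch does not model.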