Return-Path:
X-Original-To: apmail-hadoop-common-commits-archive@www.apache.org
Delivered-To: apmail-hadoop-common-commits-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by minotaur.apache.org (Postfix) with SMTP id 05C36184B3
    for ; Fri, 8 May 2015 22:47:36 +0000 (UTC)
Received: (qmail 60040 invoked by uid 500); 8 May 2015 22:47:35 -0000
Delivered-To: apmail-hadoop-common-commits-archive@hadoop.apache.org
Received: (qmail 59971 invoked by uid 500); 8 May 2015 22:47:35 -0000
Mailing-List: contact common-commits-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: common-dev@hadoop.apache.org
Delivered-To: mailing list common-commits@hadoop.apache.org
Received: (qmail 59961 invoked by uid 99); 8 May 2015 22:47:35 -0000
Received: from git1-us-west.apache.org (HELO git1-us-west.apache.org) (140.211.11.23)
    by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 08 May 2015 22:47:35 +0000
Received: by git1-us-west.apache.org (ASF Mail Server at git1-us-west.apache.org, from userid 33)
    id A8CD3E0984; Fri, 8 May 2015 22:47:35 +0000 (UTC)
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: jlowe@apache.org
To: common-commits@hadoop.apache.org
Message-Id: <6f904564526d40e3a5734d3a06ab2e2c@git.apache.org>
X-Mailer: ASF-Git Admin Mailer
Subject: hadoop git commit: YARN-3476. Nodemanager can fail to delete local logs if log aggregation fails. Contributed by Rohith (cherry picked from commit 25e2b02122c4ed760227ab33c49d3445c23b9276)
Date: Fri, 8 May 2015 22:47:35 +0000 (UTC)

Repository: hadoop
Updated Branches:
  refs/heads/branch-2.7 f57b1bfbd -> a75f4bed6


YARN-3476. Nodemanager can fail to delete local logs if log aggregation fails.
Contributed by Rohith
(cherry picked from commit 25e2b02122c4ed760227ab33c49d3445c23b9276)


Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/a75f4bed
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/a75f4bed
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/a75f4bed

Branch: refs/heads/branch-2.7
Commit: a75f4bed6e8820d4b50c029a0ce47d5245672f8b
Parents: f57b1bf
Author: Jason Lowe
Authored: Fri May 8 22:45:52 2015 +0000
Committer: Jason Lowe
Committed: Fri May 8 22:47:18 2015 +0000

----------------------------------------------------------------------
 hadoop-yarn-project/CHANGES.txt                  |  3 +++
 .../logaggregation/AppLogAggregatorImpl.java     | 19 ++++++++++++++-----
 2 files changed, 17 insertions(+), 5 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hadoop/blob/a75f4bed/hadoop-yarn-project/CHANGES.txt
----------------------------------------------------------------------
diff --git a/hadoop-yarn-project/CHANGES.txt b/hadoop-yarn-project/CHANGES.txt
index 9bb96d1..10b2a93 100644
--- a/hadoop-yarn-project/CHANGES.txt
+++ b/hadoop-yarn-project/CHANGES.txt
@@ -71,6 +71,9 @@ Release 2.7.1 - UNRELEASED
     YARN-3554. Default value for maximum nodemanager connect wait time is too
     high (Naganarasimha G R via jlowe)
 
+    YARN-3476. Nodemanager can fail to delete local logs if log aggregation
+    fails (Rohith via jlowe)
+
 Release 2.7.0 - 2015-04-20
 
   INCOMPATIBLE CHANGES

http://git-wip-us.apache.org/repos/asf/hadoop/blob/a75f4bed/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
----------------------------------------------------------------------
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
index ff70a68..e3d0819 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
@@ -385,6 +385,11 @@ public class AppLogAggregatorImpl implements AppLogAggregator {
   public void run() {
     try {
       doAppLogAggregation();
+    } catch (Exception e) {
+      // do post clean up of log directories on any exception
+      LOG.error("Error occured while aggregating the log for the application "
+          + appId, e);
+      doAppLogAggregationPostCleanUp();
     } finally {
       if (!this.appAggregationFinished.get()) {
         LOG.warn("Aggregation did not complete for application " + appId);
@@ -422,6 +427,15 @@ public class AppLogAggregatorImpl implements AppLogAggregator {
     // App is finished, upload the container logs.
     uploadLogsForContainers(true);
 
+    doAppLogAggregationPostCleanUp();
+
+    this.dispatcher.getEventHandler().handle(
+        new ApplicationEvent(this.appId,
+            ApplicationEventType.APPLICATION_LOG_HANDLING_FINISHED));
+    this.appAggregationFinished.set(true);
+  }
+
+  private void doAppLogAggregationPostCleanUp() {
     // Remove the local app-log-dirs
     List<Path> localAppLogDirs = new ArrayList<Path>();
     for (String rootLogDir : dirsHandler.getLogDirsForCleanup()) {
@@ -442,11 +456,6 @@ public class AppLogAggregatorImpl implements AppLogAggregator {
       this.delService.delete(this.userUgi.getShortUserName(), null,
         localAppLogDirs.toArray(new Path[localAppLogDirs.size()]));
     }
-
-    this.dispatcher.getEventHandler().handle(
-        new ApplicationEvent(this.appId,
-            ApplicationEventType.APPLICATION_LOG_HANDLING_FINISHED));
-    this.appAggregationFinished.set(true);
   }
 
   private Path getRemoteNodeTmpLogFileForApp() {