Date: Wed, 17 Sep 2014 18:04:35 +0000 (UTC)
From: "Xuan Gong (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-2468) Log handling for LRS

    [ https://issues.apache.org/jira/browse/YARN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137661#comment-14137661 ]

Xuan Gong commented on YARN-2468:
---------------------------------

bq. If LogContext is not specified, we're running into the traditional log handling case, right? We will still have a combined log file identified by the node id? Or node id will always be the directory, and there exists only one file under it?

The node id will always be the directory, and there is only one file under it.

bq. Let's say work-preserving NM restart happens: NM is going to forget all the uploaded log files and redo everything, right?

If an NM restart happens, it will re-upload all logs that were previously uploaded but not yet deleted. I think we can solve this problem in a separate ticket, because this ticket is only the first step toward log handling for LRS.

bq. LogContext doesn't need to be in ApplicationSubmissionContext, because ApplicationSubmissionContext contains ContainerLaunchContext. LogContext is container-related, so ContainerLaunchContext should be the best place. Currently, we can have one context for all containers. Maybe in the future we can think of setting a different LogContext for each individual container.

DONE

bq. In getFilteredLogFiles, the logic is that if the log file matches the include pattern, it will be added first, and then if it matches the exclude pattern, it will be removed. Shall we do a sanity check to make sure we cannot include and exclude the same pattern? Otherwise, the semantics are a bit weird.

Added more explanation in the Javadoc.

Uploaded a new patch to address all comments.

> Log handling for LRS
> --------------------
>
>                 Key: YARN-2468
>                 URL: https://issues.apache.org/jira/browse/YARN-2468
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: log-aggregation, nodemanager, resourcemanager
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>         Attachments: YARN-2468.1.patch, YARN-2468.2.patch, YARN-2468.3.patch, YARN-2468.3.rebase.2.patch, YARN-2468.3.rebase.patch, YARN-2468.4.1.patch, YARN-2468.4.patch, YARN-2468.5.patch
>
>
> Currently, when an application finishes, the NM starts log aggregation. For long-running service (LRS) applications, this is not ideal. The problems we have are:
> 1) LRS applications are expected to run for a long time (weeks, months).
> 2) Currently, all the container logs (from one NM) are written into a single file. That file can keep growing larger and larger.
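
The include-then-exclude semantics raised in the getFilteredLogFiles comment above can be illustrated with a minimal sketch. This is not the code from the YARN-2468 patch: the class name `LogFilterSketch`, the method `filterLogFiles`, and its parameters are hypothetical, and the sketch assumes both patterns are regular expressions matched against the whole file name, with a null include pattern meaning "include everything". The sanity check shown is only the reviewer's suggestion (reject identical include and exclude patterns), not confirmed behavior of the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch; not the code from the YARN-2468 patch.
class LogFilterSketch {

    // Assumption: includePattern / excludePattern are regular expressions
    // matched against the whole file name; null means "no pattern given".
    static List<String> filterLogFiles(List<String> fileNames,
                                       String includePattern,
                                       String excludePattern) {
        // Sanity check suggested in the review: identical patterns would
        // include and exclude exactly the same files.
        if (includePattern != null && includePattern.equals(excludePattern)) {
            throw new IllegalArgumentException(
                "include and exclude patterns must differ: " + includePattern);
        }
        Pattern include = includePattern == null ? null : Pattern.compile(includePattern);
        Pattern exclude = excludePattern == null ? null : Pattern.compile(excludePattern);

        List<String> result = new ArrayList<>();
        for (String name : fileNames) {
            // Step 1: the file is a candidate if it matches the include pattern.
            boolean included = include == null || include.matcher(name).matches();
            // Step 2: it is dropped again if it also matches the exclude pattern.
            boolean excluded = exclude != null && exclude.matcher(name).matches();
            if (included && !excluded) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> logs = Arrays.asList("stdout", "stderr", "syslog");
        // "std.*" includes stdout and stderr; "stderr" is then excluded.
        System.out.println(filterLogFiles(logs, "std.*", "stderr")); // prints [stdout]
    }
}
```

With this ordering, exclude always wins over include, so a file matching both patterns is never uploaded. Whether identical patterns should be rejected outright or simply documented is the design question the comment leaves to the Javadoc.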