Return-Path: X-Original-To: apmail-hadoop-yarn-issues-archive@minotaur.apache.org Delivered-To: apmail-hadoop-yarn-issues-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 220CF1748C for ; Wed, 8 Oct 2014 05:45:35 +0000 (UTC) Received: (qmail 74183 invoked by uid 500); 8 Oct 2014 05:45:34 -0000 Delivered-To: apmail-hadoop-yarn-issues-archive@hadoop.apache.org Received: (qmail 74074 invoked by uid 500); 8 Oct 2014 05:45:34 -0000 Mailing-List: contact yarn-issues-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: yarn-issues@hadoop.apache.org Delivered-To: mailing list yarn-issues@hadoop.apache.org Received: (qmail 73782 invoked by uid 99); 8 Oct 2014 05:45:34 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 08 Oct 2014 05:45:34 +0000 Date: Wed, 8 Oct 2014 05:45:34 +0000 (UTC) From: "Zhijie Shen (JIRA)" To: yarn-issues@hadoop.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (YARN-2582) Log related CLI and Web UI changes for LRS MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/YARN-2582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163116#comment-14163116 ] Zhijie Shen commented on YARN-2582: ----------------------------------- Thanks for the patch, Xuan! Here're some comments and thoughts. 1. This may not be sufficient. For example, "tmp_\[timestamp\].log" will become a false positive case here. We need to check whether the file name ends with TMP_FILE_SUFFIX or not. {code} && !fileName.contains(LogAggregationUtils.TMP_FILE_SUFFIX)) { {code} {code} if (!thisNodeFile.getPath().getName() .contains(LogAggregationUtils.TMP_FILE_SUFFIX)) { {code} 2. I'm thinking what the better way should be to handle the case that it fails at fetching one log file, but not the others. Thoughts? {code} AggregatedLogFormat.LogReader reader = null; try { reader = new AggregatedLogFormat.LogReader(getConf(), thisNodeFile.getPath()); if (dumpAContainerLogs(containerId, reader, System.out) > -1) { foundContainerLogs = true; } } finally { if (reader != null) { reader.close(); } } {code} 3. In fact, in the case of either single container or all containers, when the container log is not found, it is possible that 1) the container log hasn't been aggregated yet or 2) the log aggregation is not enabled. The the former case it is also possible to have the third situation: the user are providing an invalid nodeId. It's good to uniform the warning message. {code} if (!foundContainerLogs) { System.out.println("Logs for container " + containerId + " are not present in this log-file."); return -1; } {code} {code} if (! foundAnyLogs) { System.out.println("Logs not available at " + remoteAppLogDir.toString()); System.out .println("Log aggregation has not completed or is not enabled."); return -1; } {code} 4. It seems that we don't have unit tests to cover the successful path of dumping the container log(s) currently. I'm not sure if it is going to be a big addition. If it is, let's handle it separately. There're two general issues: 1. Currently we don't have the web page on RM web app to show the aggregated logs. One the other hand, LRS is supposed to be always in the running state, such that the timeline server will not present it. Therefore, we still haven't a web entrance to view the aggregated logs. 2. For LRS, the latest logs are not aggregated in the HDFS, but in the local dir of NM. When a user says he wants to check the logs of this LRS, it doesn't make sense that he can just view yesterday's log, because today's is still not uploaded to HDFS. It's good to have a combined view of both latest log on local dir of NM and aggregated logs on HDFS. Thoughts? > Log related CLI and Web UI changes for LRS > ------------------------------------------ > > Key: YARN-2582 > URL: https://issues.apache.org/jira/browse/YARN-2582 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager > Reporter: Xuan Gong > Assignee: Xuan Gong > Attachments: YARN-2582.1.patch > > > After YARN-2468, we have change the log layout to support log aggregation for Long Running Service. Log CLI and related Web UI should be modified accordingly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)