Message-ID: <15932754.1160650659918.JavaMail.jira@brutus>
Date: Thu, 12 Oct 2006 03:57:39 -0700 (PDT)
From: "Arun C Murthy (JIRA)"
To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-489) Seperating user logs from system logs in map reduce
In-Reply-To: <30146384.1156876403539.JavaMail.jira@brutus>

    [ http://issues.apache.org/jira/browse/HADOOP-489?page=comments#action_12441697 ]

Arun C Murthy commented on HADOOP-489:
--------------------------------------

Here are some
alternatives to consider vis-a-vis log rotation:

1. Custom code

Capture the stdout/stderr from the map/reduce task (TaskRunner.logStream) as is done today, then write it out to local disk, keeping the 'limits' in mind and rolling the logs over via custom code.

2. Use our logging framework (commons-logging + log4j)

There are two alternatives here:

* Option 1

The exec'ed (map/reduce) task gets a custom log4j properties file (passed as a system property to the exec'ed JVM), i.e. we configure the task's logger with the necessary RollingFileAppender (file size limit, number of files to retain, etc.). This means we have (almost) no custom code for implementing the rollover. In the long run we could also configure the task's logger/appender to do more complicated things, such as copying log fragments over to DFS instead of discarding them; again without too much fuss, just via configuration.

The flip-side is that we would force the application writer to use the commons-logging framework for all his logging if he wants the logs to be available separately via the UI etc. The user will either have to create an "org.apache.hadoop.mapred.task.id" logger, or we will need to add a Mapper.getLogger() method to let the user get hold of the configured logger.

* Option 2

To get around the limitation that we force apps to use the commons-logging framework for all logging, we could capture stdout/stderr via TaskRunner.logStream and then use a custom logger in TaskRunner itself to log both streams. Again, this ensures we can customise the logger to copy logs to DFS etc. in the long run...

The flip-side is that it might be impossible to re-configure the custom logger in TaskRunner with the per-job limits (user-log-file-size etc.) via the commons-logging API, and hence we might need to depend on log4j itself, introducing a direct dependency on it.
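To make Option 2 concrete, here is a rough standalone sketch of the stream-capture idea (the class and method names are illustrative, not actual TaskRunner code, and it uses java.util.logging instead of commons-logging/log4j purely so the sketch is self-contained): read the child's stdout line by line and re-emit each line through a per-task logger, so size limits and rollover live entirely in the logger's handler configuration rather than in custom code.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.logging.Logger;

// Illustrative sketch of Option 2: relay a child process's stdout/stderr
// through a per-task logger instead of writing the raw stream to disk.
// In TaskRunner this stream would come from the exec'ed task JVM; here we
// fake it with a ByteArrayInputStream so the sketch runs standalone.
public class TaskLogRelay {
    // Hypothetical per-task logger name, as suggested above.
    static final Logger LOG = Logger.getLogger("org.apache.hadoop.mapred.task.id");

    static {
        // In real use a size-limited, rotating handler (e.g. a FileHandler
        // with a byte limit and rotation count) would be attached here;
        // disabling the default console handler keeps this sketch quiet.
        LOG.setUseParentHandlers(false);
    }

    // Read the stream line by line and hand each line to the logger; the
    // rollover policy is the handler's problem, not this method's.
    static int relayStream(InputStream in) throws Exception {
        BufferedReader reader =
            new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        int lines = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            LOG.info(line);
            lines++;
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        InputStream fakeChildStdout = new ByteArrayInputStream(
            "map 0% reduce 0%\nmap 100% reduce 0%\n".getBytes(StandardCharsets.UTF_8));
        int relayed = relayStream(fakeChildStdout);
        System.out.println("relayed " + relayed + " lines");
    }
}
```

The point of the indirection is that swapping the rollover behaviour (local rotation vs. copy-to-DFS) then means swapping handler configuration, not touching TaskRunner logic.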
I will continue to explore whether there are ways of re-configuring TaskRunner's task-specific logger via commons-logging itself... (has anyone done this before?)

Thoughts?

> Seperating user logs from system logs in map reduce
> ---------------------------------------------------
>
>                 Key: HADOOP-489
>                 URL: http://issues.apache.org/jira/browse/HADOOP-489
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Mahadev konar
>         Assigned To: Arun C Murthy
>            Priority: Minor
>
> Currently the user logs are a part of the system logs in mapreduce. Anything logged by the user is logged into the tasktracker log files. This creates two issues:
> 1) The system log files get cluttered with user output. If the user outputs a large amount of logs, the system logs need to be cleaned up pretty often.
> 2) For the user, it is difficult to get to each of the machines and look for the logs his/her job might have generated.
> I am proposing three solutions to the problem. All of them have issues:
>
> Solution 1.
> Output the user logs on the user's screen as part of the job submission process.
> Merits:
> This will prevent users from printing large amounts of logs, and the user can get runtime feedback on what is wrong with his/her job.
> Issues:
> This proposal will use the framework's bandwidth while running jobs for the user. The user logs need to pass from the tasks to the tasktrackers, from the tasktrackers to the jobtracker, and then from the jobtracker to the jobclient, using a lot of framework bandwidth if the user prints out too much data.
>
> Solution 2.
> Output the user logs into a DFS directory and then concatenate these files. Each task can create a file for its output in the log directory for a given user and jobid.
> Issues:
> This will create a huge number of small files in DFS, which later need to be concatenated into a single file. There is also the question of who would concatenate these files into a single file.
> This could be done by the framework (jobtracker) as part of the cleanup for the jobs, though this might stress the jobtracker.
>
> Solution 3.
> Put the user logs into a separate user log file in the log directory on each tasktracker. We can provide some tools to query these local log files, e.g. commands like "for jobid j and taskid t, get me the user log output". These tools could run as a separate map reduce program, with each map grepping the user log files and a single reduce aggregating the logs into a single DFS file.
> Issues:
> This does sound like more work for the user. Also, the output might not be complete, since a tasktracker might have gone down after it ran the job.
> Any thoughts?

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
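For what it's worth, the query tool proposed in Solution 3 could, in its simplest non-distributed form, be a grep keyed on task id. The sketch below is purely illustrative: the log line format (taskid, tab, message) and the method names are hypothetical, and a real tool would scan the tasktracker's log directory rather than an in-memory string; the distributed variant would run this as the map phase, with a single reduce concatenating the matches into one DFS file.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of Solution 3's query tool: given a task id, pull
// that task's lines out of a tasktracker-local user log. The log format
// (taskid<TAB>message) is assumed for the sketch, not taken from Hadoop.
public class UserLogGrep {
    static List<String> grepTask(BufferedReader log, String taskId) throws Exception {
        List<String> matches = new ArrayList<>();
        String line;
        while ((line = log.readLine()) != null) {
            // Keep only lines stamped with the requested task id,
            // stripping the id prefix so just the user's message remains.
            if (line.startsWith(taskId + "\t")) {
                matches.add(line.substring(taskId.length() + 1));
            }
        }
        return matches;
    }

    public static void main(String[] args) throws Exception {
        // A tiny fake user log mixing two tasks' output.
        String logFile =
            "task_0001_m_000000\tstarting map\n" +
            "task_0001_r_000000\tstarting reduce\n" +
            "task_0001_m_000000\tmap done\n";
        List<String> out =
            grepTask(new BufferedReader(new StringReader(logFile)), "task_0001_m_000000");
        System.out.println(out.size() + " lines: " + out);
    }
}
```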