From: "Amareshwari Sriramadasu (JIRA)"
To: core-dev@hadoop.apache.org
Date: Mon, 14 Apr 2008 21:45:05 -0700 (PDT)
Subject: [jira] Commented: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'
Message-ID: <900582728.1208234705116.JavaMail.jira@brutus>
In-Reply-To: <2123979905.1208221266408.JavaMail.jira@brutus>

[ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588897#action_12588897 ]

Amareshwari Sriramadasu commented on HADOOP-3256:
-------------------------------------------------

Changing the log file name to use only the jobId breaks history. The history JSPs assume that the path name has the form jobtrackerHostname_jobId_username_jobName. Since there is no master index now, the history-viewing JSPs list the paths in the history directory and parse the path names to give the user more information about each log file. (For more information, see HADOOP-2178.)

I can see the following options:
1. One solution could be to write this information as the first line of the history log, but the JSPs would then have to read the first line of every history file to print the first page. This is going to be a lot of reads.
2. Another solution could be to construct an acceptable path from the jobtrackerHostname, jobId, username and jobName.

Thoughts?

> JobHistory file on HDFS should not use the 'job name'
> -----------------------------------------------------
>
>                 Key: HADOOP-3256
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3256
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: HADOOP-3256_0_20080414.patch
>
>
> HADOOP-2178 introduced the feature of saving jobhistory logs on HDFS.
> Unfortunately the following code:
> {noformat}
>     // setup the history log file for this job
>     String logFileName = jobUniqueString +
>                          "_" + user+ "_" + jobName;
>     if (logFileName.length() > MAX_FILENAME_SIZE) {
>       logFileName = logFileName.substring(0, MAX_FILENAME_SIZE-1);
>     }
> {noformat}
> is vulnerable to user-provided job names.
> Specifically I ran into 'URISyntaxException' with jobs whose names include a ":".
> The easy fix is to ensure that we do not use the human-friendly job names and only the jobid.
> The long term fix is to ensure that Path handles filenames with _any_ characters.

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
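
For illustration only, option 2 in the comment above (an acceptable path that still carries the username and jobName) could be sketched roughly as below. This is not the attached patch and not existing Hadoop code: the class HistoryFileNameSketch and its methods are hypothetical, and percent-encoding via java.net.URLEncoder is just one possible escaping scheme. Whether Path accepts every character the encoder leaves untouched (for example "*") has not been checked here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper, not part of Hadoop: builds and parses history file
// names of the form jobUniqueString_user_jobName with escaped user fields.
public class HistoryFileNameSketch {

    // Percent-encode a user-supplied field so it contains no ':' or '/',
    // and also no '_' (the file name uses '_' as its field separator,
    // and URLEncoder leaves '_' unchanged by default).
    public static String escape(String field) {
        try {
            return URLEncoder.encode(field, "UTF-8").replace("_", "%5F");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Reverse of escape(): "%3A" becomes ':', "%5F" becomes '_', '+' becomes ' '.
    public static String unescape(String field) {
        try {
            return URLDecoder.decode(field, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // jobUniqueString (jobtrackerHostname_jobId) is system-generated and is
    // left as-is; only the user-controlled fields are escaped.
    public static String buildLogFileName(String jobUniqueString,
                                          String user, String jobName) {
        return jobUniqueString + "_" + escape(user) + "_" + escape(jobName);
    }

    // Returns {jobUniqueString, user, jobName}. Parsing from the right works
    // because the escaped user and jobName can no longer contain '_'.
    public static String[] parseLogFileName(String logFileName) {
        int nameSep = logFileName.lastIndexOf('_');
        int userSep = logFileName.lastIndexOf('_', nameSep - 1);
        if (nameSep < 0 || userSep < 0) {
            throw new IllegalArgumentException("Unexpected name: " + logFileName);
        }
        return new String[] {
            logFileName.substring(0, userSep),
            unescape(logFileName.substring(userSep + 1, nameSep)),
            unescape(logFileName.substring(nameSep + 1))
        };
    }

    public static void main(String[] args) {
        // Example values are made up for the demonstration.
        String name = buildLogFileName("jt_job_0001", "alice", "word count: run_1");
        System.out.println(name);
        for (String part : parseLogFileName(name)) {
            System.out.println(part);
        }
    }
}
```

The sketch leaves out the MAX_FILENAME_SIZE truncation from the quoted code. If truncation is kept, it would have to avoid cutting an escape sequence such as "%3A" in half, otherwise the JSPs could not decode the name again.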