hadoop-common-dev mailing list archives

From "Amareshwari Sriramadasu (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3256) JobHistory file on HDFS should not use the 'job name'
Date Tue, 15 Apr 2008 04:45:05 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588897#action_12588897 ]

Amareshwari Sriramadasu commented on HADOOP-3256:

Changing the log file name to use only the jobId breaks the history pages.
The history JSPs assume that the path name has the form jobtrackerHostname_jobId_username_jobName.
Since there is no master index now, the history-viewing JSPs list the paths in the history directory
and parse the path names to show the user more information about each log file.
(For more information, see HADOOP-2178.)
I can see the following options:
1. Write this information as the first line of the history log. The JSPs would then have to
read the first line of every history file to render the first page, which is going to be a lot of reads.
2. Construct an acceptable path name from the jobtrackerHostname, jobId, username and jobName.
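
As a rough illustration of option 2, here is a minimal sketch that keeps all four fields in the
file name but URL-encodes the user-supplied parts, so that characters such as ":" never reach the
path. The class and method names are hypothetical, and URL encoding is only one possible way to
get an acceptable name; it is not taken from any attached patch.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper, for illustration only; not part of Hadoop.
class HistoryFileNameSketch {

    // Percent-encode a user-supplied field; ':' becomes "%3A", '_' is left as-is.
    static String encode(String field) {
        try {
            return URLEncoder.encode(field, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Reverse of encode(), for the code that displays the job name.
    static String decode(String field) {
        try {
            return URLDecoder.decode(field, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds jobtrackerHostname_jobId_username_jobName with the
    // user-controlled fields (user, jobName) encoded.
    static String buildLogFileName(String jobtrackerHostname, String jobId,
                                   String user, String jobName) {
        return jobtrackerHostname + "_" + jobId
            + "_" + encode(user) + "_" + encode(jobName);
    }
}
```

The history JSPs could then call decode() on the last field before displaying it. Note that
truncating an encoded name to a maximum length could cut a "%XX" escape in half, so a real
change would have to handle that case.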


> JobHistory file on HDFS should not use the 'job name'
> -----------------------------------------------------
>                 Key: HADOOP-3256
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3256
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.17.0
>         Attachments: HADOOP-3256_0_20080414.patch
> HADOOP-2178 introduced the feature of saving jobhistory logs on HDFS.
> Unfortunately the following code:
> {noformat}
>         // setup the history log file for this job
>         String logFileName = jobUniqueString +  
>                              "_" + user+ "_" + jobName;
>         if (logFileName.length() > MAX_FILENAME_SIZE) {
>           logFileName = logFileName.substring(0, MAX_FILENAME_SIZE-1);
>         }
> {noformat}
> is vulnerable to user-provided job names. 
> Specifically I ran into 'URISyntaxException' with jobs whose names include a ":".
> The easy fix is to ensure that we do not use the human-friendly job names and only the
> The long term fix is to ensure that Path handles filenames with _any_ characters.
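
To illustrate how a ":" in a name can lead to a 'URISyntaxException', the following sketch uses
only java.net.URI. It assumes, as a simplification, that the text before the first colon is
treated as a URI scheme and the rest as the path; the class and method names are hypothetical
and this is not the actual Path code.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo, for illustration only; not the Hadoop Path implementation.
class ColonInNameDemo {

    // Returns true if splitting the name at the first ':' into scheme and
    // path makes java.net.URI reject it.
    static boolean failsAsUri(String name) {
        int colon = name.indexOf(':');
        if (colon < 0) {
            return false; // no colon, so nothing is mistaken for a scheme
        }
        String scheme = name.substring(0, colon);
        String path = name.substring(colon + 1);
        try {
            // A non-empty path that does not start with '/' is rejected
            // when a scheme is present ("Relative path in absolute URI").
            new URI(scheme, null, path, null, null);
            return false;
        } catch (URISyntaxException e) {
            return true;
        }
    }
}
```

Under that assumption, a job name such as "wordcount:run1" is rejected, while "wordcount" is not.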

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
