hadoop-common-dev mailing list archives

From "Amar Kamat (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3937) Job history may get disabled due to overly long job names
Date Thu, 14 Aug 2008 03:33:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622431#action_12622431 ]

Amar Kamat commented on HADOOP-3937:
------------------------------------

bq. _job-history-start-time___job-id___job-name___user-name_
I left the jobtracker's hostname out of this. The history filename actually looks like
_job-history-start-time___jobtracker-hostname___job-id___job-name___user-name_

A job's id is unique within a jobtracker but not across jobtrackers, although it's unlikely
that two jobtrackers will start at the same time. Since running two jobtrackers on the same
node is even less likely, I think it's safe to assume that _jobtracker-hostname___job-id_ is
unique across clusters. One simple way to get short, unique history filenames would be to use
something like _job-id___*f*(jobtracker-hostname)_, where _*f*_(s) is something like a hash.
The mapping from _*f*_(s) to _jobtracker-hostname_ can be kept in some _index_ file along with
the username and jobname information. Thoughts?
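
The scheme suggested above could be sketched roughly as follows. This is only an illustration, not code from the attached patch: the class `HistoryFileNames`, its methods, the choice of MD5 truncated to 8 hex characters for _*f*_, and the in-memory map standing in for the _index_ file are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: builds short history filenames of the
// form <job-id>_<f(jobtracker-hostname)> and remembers what each one stands for.
class HistoryFileNames {
    // Stand-in for the index file: filename -> {hostname, username, jobname}.
    private final Map<String, String[]> index = new LinkedHashMap<>();

    // f(s): first 4 bytes of the MD5 digest as 8 hex characters (an assumption;
    // any stable hash would do). Such a short hash can collide for different
    // hostnames, so a real index would need to detect that case.
    static String shortHash(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(s.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 4; i++) {
                sb.append(String.format("%02x", digest[i] & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Returns the short filename and records the long fields in the index,
    // so the job name never affects the filename length.
    String fileNameFor(String jobId, String hostname, String user, String jobName) {
        String fileName = jobId + "_" + shortHash(hostname);
        index.put(fileName, new String[] {hostname, user, jobName});
        return fileName;
    }

    // Looks up {hostname, username, jobname} for a filename, or null if unknown.
    String[] lookup(String fileName) {
        return index.get(fileName);
    }

    public static void main(String[] args) {
        HistoryFileNames names = new HistoryFileNames();
        // The job id and hostname below are made-up sample values.
        String f = names.fileNameFor("job_200808140333_0001", "jt1.example.com",
                                     "amar", "a job name of arbitrary length");
        System.out.println(f + " -> " + String.join(", ", names.lookup(f)));
    }
}
```

With this layout the filename length depends only on the job id and the fixed-width hash, while the hostname, username and jobname remain recoverable through the index.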


> Job history may get disabled due to overly long job names
> ---------------------------------------------------------
>
>                 Key: HADOOP-3937
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3937
>             Project: Hadoop Core
>          Issue Type: Bug
>    Affects Versions: 0.17.0, 0.17.1, 0.18.0, 0.19.0
>            Reporter: Matei Zaharia
>         Attachments: HADOOP-3937.patch
>
>
> Since Hadoop 0.17, the job history logs include the job's name in the filename. However,
> this can lead to overly long filenames, because job names may be arbitrarily long. When a
> filename is too long for the underlying OS, file creation fails and the JobHistory class silently
> disables history from that point on. This can lead to days of lost history until somebody
> notices the error in the log.
> Proposed solution: Trim the job name to a reasonable length when selecting a filename
> for the history file.
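
The trimming proposed in the issue description might look something like the sketch below. It is not taken from HADOOP-3937.patch: the class name `JobNameTrimmer` and the 50-character limit are assumptions chosen only for illustration, and the limit counts characters, whereas filesystem limits are usually expressed in bytes.

```java
// Hypothetical helper, not part of Hadoop: caps the job-name component used
// when building a history filename.
class JobNameTrimmer {
    // Assumed limit; a real value would depend on the target filesystem and on
    // how long the other filename components are.
    static final int MAX_JOB_NAME_LENGTH = 50;

    // Returns the job name unchanged if it is short enough, otherwise its
    // first MAX_JOB_NAME_LENGTH characters.
    static String trim(String jobName) {
        if (jobName.length() <= MAX_JOB_NAME_LENGTH) {
            return jobName;
        }
        return jobName.substring(0, MAX_JOB_NAME_LENGTH);
    }

    public static void main(String[] args) {
        StringBuilder longName = new StringBuilder();
        for (int i = 0; i < 300; i++) {
            longName.append('x');
        }
        System.out.println(trim(longName.toString()).length());
    }
}
```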

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
