hadoop-mapreduce-issues mailing list archives

From "Dick King (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-323) Improve the way job history files are managed
Date Wed, 02 Jun 2010 19:08:47 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874745#action_12874745 ]

Dick King commented on MAPREDUCE-323:

It has been correctly pointed out to me that the syntax, {{%xi,j}} , is just weird.

In keeping with Java data format conventions, I will use
{{%leftmost_index$width.precision x}} to name a segment of the jobID index.

{{leftmost_index}} names the leftmost digit and {{width}} names the number of digits; {{width}}
defaults to 1.  If you name a digit that doesn't exist, the output gets the empty string in
the corresponding position [except as specified in {{precision}}, below].  It is an error
for {{width}} to exceed {{leftmost_index}} .

{{precision}} is a minimum number of digits to output; defaults to 0.  It is an error for
{{precision}} to exceed {{width}}.  If {{precision}} requires more digits than exist in the
index, we supply zeroes.

It is an error to omit {{leftmost_index}}.  It is an error to code a {{$}} if there is no
width.  It is an error to code a {{.}} if there is no {{precision}}.  It is an error to omit
{{width}} if there is a precision.
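
The rules above can be sketched in Java as follows. This is only an illustration, not the patch attached to this issue, and it rests on several assumptions the comment does not spell out: digit positions count from the right starting at 1 (which is what makes "width may not exceed leftmost_index" meaningful), the {{x}} follows the numbers with no space, {{precision}} pads with zeroes on the left, the index arrives as a string of decimal digits, and any text outside a segment is copied through unchanged. The class and method names ({{SubdirectoryFormat}}, {{expand}}) are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not Hadoop code.
final class SubdirectoryFormat {
    // Groups: 1 = leftmost_index, 2 = "$", 3 = width, 4 = ".", 5 = precision.
    private static final Pattern SEGMENT =
        Pattern.compile("%(\\d*)(\\$(\\d*))?(\\.(\\d*))?x");

    /** Expands every segment in format against the digits of the jobID index. */
    static String expand(String format, String indexDigits) {
        StringBuilder out = new StringBuilder();
        Matcher m = SEGMENT.matcher(format);
        int last = 0;
        while (m.find()) {
            out.append(format, last, m.start()); // literal text between segments
            out.append(expandSegment(m, indexDigits));
            last = m.end();
        }
        out.append(format, last, format.length());
        return out.toString();
    }

    private static String expandSegment(Matcher m, String digits) {
        boolean hasDollar = m.group(2) != null;
        boolean hasDot = m.group(4) != null;
        if (m.group(1).isEmpty()) throw error(m, "leftmost_index is required");
        if (hasDollar && m.group(3).isEmpty()) throw error(m, "'$' without width");
        if (hasDot && m.group(5).isEmpty()) throw error(m, "'.' without precision");
        if (hasDot && !hasDollar) throw error(m, "precision without width");

        int leftmost = Integer.parseInt(m.group(1));
        int width = hasDollar ? Integer.parseInt(m.group(3)) : 1; // default 1
        int precision = hasDot ? Integer.parseInt(m.group(5)) : 0; // default 0
        if (width > leftmost) throw error(m, "width exceeds leftmost_index");
        if (precision > width) throw error(m, "precision exceeds width");

        StringBuilder seg = new StringBuilder();
        int len = digits.length();
        // Assumption: position 1 is the rightmost digit of the index.
        for (int pos = leftmost; pos > leftmost - width; pos--) {
            if (pos <= len) seg.append(digits.charAt(len - pos)); // missing digit: nothing
        }
        // Assumption: precision pads with zeroes on the left.
        while (seg.length() < precision) seg.insert(0, '0');
        return seg.toString();
    }

    private static IllegalArgumentException error(Matcher m, String why) {
        return new IllegalArgumentException(why + ": " + m.group());
    }
}
```

Under those assumptions, {{SubdirectoryFormat.expand("%6$2x/%4$2x", "123456")}} yields {{12/34}}, {{expand("%6$2.2x", "12345")}} yields {{01}}, and {{expand("%2$3x", "12345")}} throws because {{width}} exceeds {{leftmost_index}}.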

The format is set by the configuration variable {{mapreduce.jobhistory.completed.subdirectory.format}}.
The default is the empty string [which gives the behavior that we get now; no subdirectories].

> Improve the way job history files are managed
> ---------------------------------------------
>                 Key: MAPREDUCE-323
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-323
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>    Affects Versions: 0.21.0, 0.22.0
>            Reporter: Amar Kamat
>            Assignee: Dick King
>            Priority: Critical
> Today all the jobhistory files are dumped in one _job-history_ folder. This can cause
> problems when there is a need to search the history folder (job-recovery etc). It would be
> nice if we group all the jobs under a _user_ folder. So all the jobs for user _amar_ will
> go in _history-folder/amar/_. Jobs can be categorized using various features like _jobid,
> date, jobname_ etc but using _username_ will make the search much more efficient and also
> will not result into namespace explosion.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
