hadoop-common-dev mailing list archives

From "Owen O'Malley (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2915) mapred output files and directories should be created as the job submitter, not tasktracker or jobtracker
Date Tue, 04 Mar 2008 22:49:40 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12575175#action_12575175 ]

Owen O'Malley commented on HADOOP-2915:

You cannot use UserGroupInformation as the key in a hash table, because it defines neither
hashCode nor equals. I would recommend using the user name instead, since String has
all of the important methods defined.
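
As a minimal sketch of the problem, assume a hypothetical class PlainCredentials that, like the class discussed above, overrides neither equals nor hashCode. It is a stand-in for illustration only, not Hadoop's UserGroupInformation. Two instances for the same user are then different keys, while the user name String works as expected:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a credentials class that does not override
// equals() or hashCode(); it is NOT Hadoop's UserGroupInformation.
final class PlainCredentials {
    private final String userName;

    PlainCredentials(String userName) {
        this.userName = userName;
    }

    String getUserName() {
        return userName;
    }
}

class UserKeyDemo {
    // Keyed by the object: a second, logically equal instance misses,
    // because Object.equals() compares identity.
    static boolean lookupByObject() {
        Map<PlainCredentials, String> cache = new HashMap<>();
        cache.put(new PlainCredentials("test"), "fs-for-test");
        return cache.containsKey(new PlainCredentials("test"));
    }

    // Keyed by the user name: String defines equals() and hashCode(),
    // so a second instance for the same user finds the entry.
    static boolean lookupByName() {
        Map<String, String> cache = new HashMap<>();
        cache.put(new PlainCredentials("test").getUserName(), "fs-for-test");
        return cache.containsKey(new PlainCredentials("test").getUserName());
    }

    public static void main(String[] args) {
        System.out.println("object key hit: " + lookupByObject());
        System.out.println("name key hit:   " + lookupByName());
    }
}
```

With the object as the key, every request builds a new cache entry instead of reusing the old one, so the cache never hits.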

In general, abbreviations in variable names are hard to read, and "saToFs" is pretty opaque.

The finally should go on the same line as the closing brace.

It seems really error-prone to have a cache that gives out FileSystems more than once and requires
users to close them. Take, for instance, the case where a user has two jobs running at the
same time: you can easily end up with two copies of the FileSystem being used for different
jobs. I propose that we fix this by making the FileSystem cache hold weak references to
the file systems and making the finalizers close the file systems.
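
A rough sketch of the weak-reference half of this proposal follows. FakeFileSystem and WeakFsCache are hypothetical names used only for illustration; they are not Hadoop classes and this is not the patch. The close-on-collection step is left out; a finalizer, or java.lang.ref.Cleaner on Java 9 and later, would be one way to add it.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a file system handle tied to one user.
final class FakeFileSystem {
    final String owner;

    FakeFileSystem(String owner) {
        this.owner = owner;
    }
}

// Cache keyed by user name that holds its values only weakly, so an
// entry does not keep a file system alive once no job references it.
final class WeakFsCache {
    private final Map<String, WeakReference<FakeFileSystem>> cache = new HashMap<>();

    synchronized FakeFileSystem get(String userName) {
        WeakReference<FakeFileSystem> ref = cache.get(userName);
        FakeFileSystem fs = (ref == null) ? null : ref.get();
        if (fs == null) {
            // Either never cached, or already reclaimed by the collector.
            fs = new FakeFileSystem(userName);
            cache.put(userName, new WeakReference<>(fs));
        }
        return fs;
    }
}
```

While any caller holds a strong reference, every get for that user returns the same instance, and nobody has to call close explicitly. Once the last strong reference is dropped, the collector may reclaim the instance and the next get creates a fresh one.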

The JobTracker should keep a reference to the output file system in JobInProgress and reuse
it for the life of the job. When the job completes, the fields should be cleared by
JobTracker.markCompletedJob, so that the file system can be reclaimed by the garbage collector.

> mapred output files and directories should be created as the job submitter, not tasktracker or jobtracker
> ---------------------------------------------------------------------------------------------------------
>                 Key: HADOOP-2915
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2915
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs, mapred
>    Affects Versions: 0.16.0
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.16.1
>         Attachments: 2915_20080229.patch, 2915_20080302.patch, 2915_20080303.patch
> Quoted from an email sent to core-dev by Andy Li:
> {quote}
> For example, assuming I have installed Hadoop with an account 'hadoop' and I am going
> to run my program with user account 'test'. I have created an input folder as /user/test/input/
> with user 'test' and the permission is set to 0775.
> /user/test/input      <dir>          2008-02-27 01:20 rwxr-xr-x      test  hadoop
> When I run the MapReduce code, the output I specified will be set to user 'hadoop' instead
> of 'test'.
> /bin/hadoop jar /tmp/test_perm.jar -m 57 -r 3 "/user/test/input/l" "/user/test/output/"
> The directory "/user/test/output/" will have the following permission and user:group.
> /user/test/output    <dir>          2008-02-27 03:53        rwxr-xr-x hadoop  hadoop
> {quote}

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
