hadoop-common-dev mailing list archives

From "Sameer Paranjpye (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2915) mapred output files and directories should be created as the job submitter, not tasktracker or jobtracker
Date Wed, 05 Mar 2008 00:53:40 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12575211#action_12575211 ]

Sameer Paranjpye commented on HADOOP-2915:

> It seems really error-prone having a cache that gives out FileSystems more than once
> and requiring users to close them. Take, for instance, the case where a user has
> two jobs running at the same time: you can easily end up with two copies of the FileSystem
> being used for different jobs. I propose that we fix this by making the
> FileSystem cache contain weak references to the file systems and making the finalizers
> close the filesystem.

Will the finalizer also delete the filesystem entry from the cache? It probably should;
otherwise we could end up with lots of dead weak references in the cache. We need to be
careful when deleting the entry, though. A weak reference starts returning null before the
finalizer runs. Between the time the weak reference dies and the time the finalizer runs,
another call to get() could cause a new filesystem object to be instantiated for the same
authority and user. The finalizer should therefore delete the entry iff the weak reference
stored in the cache is currently returning null. It should also cope with the entry not
being present.
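
The removal rule above can be sketched roughly as follows. This is only an illustration, not the Hadoop FileSystem cache code: the class WeakCacheSketch, the method removeIfDead() and the generic key (standing in for the authority and user) are hypothetical names, and the sketch assumes that whatever cleanup runs for a dead value calls removeIfDead() with that value's key.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch; not the Hadoop FileSystem cache implementation.
class WeakCacheSketch<K, V> {
    // Package-private so callers in the same package can inspect entries.
    final Map<K, WeakReference<V>> entries = new HashMap<>();

    // Returns the cached value, creating a new one if the entry is
    // missing or its weak reference has already been cleared.
    synchronized V get(K key, Supplier<V> factory) {
        WeakReference<V> ref = entries.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = factory.get();
            entries.put(key, new WeakReference<>(value));
        }
        return value;
    }

    // Cleanup hook, meant to be called when an old value is finalized.
    // Removes the entry only if its reference currently returns null.
    synchronized boolean removeIfDead(K key) {
        WeakReference<V> ref = entries.get(key);
        if (ref == null) {
            return false; // entry not present: nothing to do
        }
        if (ref.get() != null) {
            return false; // a later get() installed a live value: keep it
        }
        entries.remove(key);
        return true;
    }
}
```

If get() has already replaced the dead reference with a new object, a late call to removeIfDead() sees a live reference and leaves the entry alone, which covers the window between the reference dying and the finalizer running.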

> mapred output files and directories should be created as the job submitter, not tasktracker
> or jobtracker
> ---------------------------------------------------------------------------------------------------------
>                 Key: HADOOP-2915
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2915
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs, mapred
>    Affects Versions: 0.16.0
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.16.1
>         Attachments: 2915_20080229.patch, 2915_20080302.patch, 2915_20080303.patch
> Quoted from an email sent to core-dev by Andy Li:
> {quote}
> For example, assume I have installed Hadoop with an account 'hadoop' and I am going
> to run my program with user account 'test'. I have created an input folder /user/test/input/
> as user 'test', with the permission set to 0775.
> /user/test/input      <dir>          2008-02-27 01:20 rwxr-xr-x      test  hadoop
> When I run the MapReduce code, the output directory I specified is owned by user 'hadoop'
> instead of 'test'.
> /bin/hadoop jar /tmp/test_perm.jar -m 57 -r 3 "/user/test/input/l" "/user/test/output/"
> The directory "/user/test/output/" will have the following permission and user:group.
> /user/test/output    <dir>          2008-02-27 03:53        rwxr-xr-x hadoop  hadoop
> {quote}

