hadoop-common-dev mailing list archives

From "Tsz Wo (Nicholas), SZE (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2915) mapred output files and directories should be created as the job submitter, not tasktracker or jobtracker
Date Fri, 29 Feb 2008 02:10:51 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573563#action_12573563 ]

Tsz Wo (Nicholas), SZE commented on HADOOP-2915:
------------------------------------------------

This problem is due to FileSystem.CACHE, which stores FileSystem objects according to their
schemes and authorities, but not their user information.

Suppose user foo has called FileSystem.getFileSystem(...) with some uri U.
User bar calls getFileSystem(...) with another uri V, where the schemes and authorities of
U and V are the same.
Then getFileSystem(...) returns to bar the cached FileSystem, which carries foo's account.

In this problem, JobTracker calls getFileSystem(...) first.  Then some task calls
getFileSystem(...) and, as it turns out, gets a FileSystem with JobTracker's account.
The user information of the task stored in jobConf is not used because of FileSystem.CACHE.
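
To make the scenario concrete, here is a minimal sketch of a cache keyed only by scheme and authority. It is a toy model, not Hadoop's actual code: the class names, the FakeFileSystem type, the get(...) helper and the namenode:9000 address are all made up for illustration.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

/** Toy model of a cache keyed only by scheme and authority (not Hadoop's code). */
public class SchemeAuthorityCacheDemo {
    /** Hypothetical stand-in for a FileSystem bound to the user who created it. */
    static final class FakeFileSystem {
        final String owner;
        FakeFileSystem(String owner) { this.owner = owner; }
    }

    private static final Map<String, FakeFileSystem> CACHE = new HashMap<>();

    /** Returns the cached instance for the URI's scheme and authority; the user is ignored in the key. */
    static FakeFileSystem get(URI uri, String user) {
        String key = uri.getScheme() + "://" + uri.getAuthority();
        return CACHE.computeIfAbsent(key, k -> new FakeFileSystem(user));
    }

    public static void main(String[] args) {
        // The first caller (think JobTracker, running as 'hadoop') populates the cache.
        FakeFileSystem tracker = get(URI.create("hdfs://namenode:9000/user/hadoop"), "hadoop");
        // A later caller acting for user 'test' asks for a different path on the same authority.
        FakeFileSystem task = get(URI.create("hdfs://namenode:9000/user/test/output"), "test");

        System.out.println(task.owner);      // prints "hadoop"
        System.out.println(tracker == task); // prints "true"
    }
}
```

The second caller receives the first caller's object, so anything it creates would be owned by 'hadoop' rather than 'test'.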

Possible solutions:
- In addition to schemes and authorities, add user information to the keys of FileSystem.CACHE.

- A cached FileSystem object is removed from FileSystem.CACHE when FileSystem.close() is called.
So we could close each FileSystem object before another FileSystem is opened.  However, this might
affect performance.

I guess the first one is better.
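
The first option could look roughly like the sketch below, where the cache key also includes the user. Again this is only an illustration under assumed names (Key, FakeFileSystem, get(...), the namenode:9000 address); how the real FileSystem.CACHE would obtain the user is not shown here.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Toy model of a cache keyed by scheme, authority and user (not Hadoop's code). */
public class UserAwareCacheDemo {
    /** Hypothetical stand-in for a FileSystem bound to the user who created it. */
    static final class FakeFileSystem {
        final String owner;
        FakeFileSystem(String owner) { this.owner = owner; }
    }

    /** Hypothetical cache key: scheme and authority plus the user name. */
    static final class Key {
        final String scheme;
        final String authority;
        final String user;

        Key(URI uri, String user) {
            this.scheme = uri.getScheme();
            this.authority = uri.getAuthority();
            this.user = user;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key other = (Key) o;
            return Objects.equals(scheme, other.scheme)
                && Objects.equals(authority, other.authority)
                && Objects.equals(user, other.user);
        }

        @Override public int hashCode() {
            return Objects.hash(scheme, authority, user);
        }
    }

    private static final Map<Key, FakeFileSystem> CACHE = new HashMap<>();

    /** Returns the instance cached for this scheme, authority and user, creating it if needed. */
    static FakeFileSystem get(URI uri, String user) {
        return CACHE.computeIfAbsent(new Key(uri, user), k -> new FakeFileSystem(user));
    }

    public static void main(String[] args) {
        FakeFileSystem tracker = get(URI.create("hdfs://namenode:9000/user/hadoop"), "hadoop");
        FakeFileSystem task = get(URI.create("hdfs://namenode:9000/user/test/output"), "test");
        FakeFileSystem taskAgain = get(URI.create("hdfs://namenode:9000/user/test/input"), "test");

        System.out.println(task.owner);        // prints "test"
        System.out.println(tracker == task);   // prints "false"
        System.out.println(task == taskAgain); // prints "true"
    }
}
```

Calls made for the same user still share one cached object, so caching is preserved, while different users no longer see each other's instances.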


> mapred output files and directories should be created as the job submitter, not tasktracker or jobtracker
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2915
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2915
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.0
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>
> Quoted from an email sent to core-dev by Andy Li:
> {quote}
> For example, assuming I have installed Hadoop with an account 'hadoop' and I am going
> to run my program with user account 'test'. I have created an input folder as /user/test/input/
> with user 'test' and the permission is set to 0775.
> /user/test/input      <dir>          2008-02-27 01:20 rwxr-xr-x      test  hadoop
> When I run the MapReduce code, the output I specified will be set to user 'hadoop' instead
> of 'test'.
> /bin/hadoop jar /tmp/test_perm.jar -m 57 -r 3 "/user/test/input/l" "/user/test/output/"
> The directory "/user/test/output/" will have the following permission and user:group.
> /user/test/output    <dir>          2008-02-27 03:53        rwxr-xr-x hadoop  hadoop
> {quote}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

