hadoop-common-issues mailing list archives

From "Ivan Mitic (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8731) Public distributed cache support for Windows
Date Fri, 14 Sep 2012 19:16:08 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456050#comment-13456050 ]

Ivan Mitic commented on HADOOP-8731:

Thanks for reviewing, Vinod!

bq. In your java comment for ancestorsHaveExecutePermissions(), please also mention that this
change is only needed to enable LocalJobRunner to use Public-dist-cache. I'd also like the
subject of this ticket to be changed - "Public dist-cache support for LocalJobRunner on Windows"
The change does not apply only to LocalJobRunner, but to the distributed cache in general. I tried
to explain above what the problem is and how I am trying to solve it; let me know if you need
additional clarification.

bq. The changes involving "FileUtil.chmod()" look spurious, can you explain those changes?
Bikas asked the same question above :) Quoting my answer: {quote} The issue is that the right
permissions are not set on files if I do not make this change. If you take a look at the previous
FileUtil.chmod() call, it only sets permissions for archives, but not for files. Now that I have
moved it below, it sets the permissions for both files and archives. {quote}

Let me know if you have additional questions/comments.
> Public distributed cache support for Windows
> --------------------------------------------
>                 Key: HADOOP-8731
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8731
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: filecache
>            Reporter: Ivan Mitic
>            Assignee: Ivan Mitic
>         Attachments: HADOOP-8731-PublicCache.patch
> A distributed cache file is considered public (sharable between MR jobs) if OTHER has
> read permissions on the file and +x permissions all the way up in the folder hierarchy. By
> default, Windows permissions are mapped to "700" all the way up to the drive letter, and it
> is unreasonable to ask users to change the permission on the whole drive to make the file
> public. IOW, it is hardly possible to have public distributed cache on Windows.
> To enable the scenario and make it more "Windows friendly", the criteria on when a file
> is considered public should be relaxed. One proposal is to check whether the user has given
> EVERYONE group permission on the file only (and discard the +x check on parent folders).
> Security considerations for the proposal: Default permissions on Unix platforms are usually
> "775" or "755" meaning that OTHER users can read and list folders by default. What this also
> means is that Hadoop users have to explicitly make the files private in order to make them
> private in the cluster (please correct me if this is not the case in real life!). On Windows,
> default permissions are "700". This means that by default all files are private. In the new
> model, if users want to make them public, they have to explicitly add EVERYONE group permissions
> on the file.
> TestTrackerDistributedCacheManager fails because of this issue.
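
To make the two criteria in the description concrete, here is a minimal, self-contained sketch that contrasts the current rule (OTHER read on the file plus +x on every ancestor folder) with the proposed relaxed rule (file permission only). It is not the code from the attached patch and does not use any Hadoop API: the class PublicCacheCheck, its methods, the in-memory map of octal modes, and the sample paths are all hypothetical, and treating "EVERYONE group permission" as the OTHER read bit is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model: paths map to Unix-style octal modes. Not Hadoop code.
public class PublicCacheCheck {
    static final int OTHER_READ = 0004; // r bit for OTHER
    static final int OTHER_EXEC = 0001; // x bit for OTHER

    /** Parent of a '/'-separated absolute path, or null for the root. */
    static String parent(String path) {
        if (path.equals("/")) {
            return null;
        }
        int idx = path.lastIndexOf('/');
        return idx == 0 ? "/" : path.substring(0, idx);
    }

    /** True if the path is known and has all of the given permission bits. */
    static boolean hasBits(Map<String, Integer> modes, String path, int bits) {
        Integer mode = modes.get(path);
        return mode != null && (mode & bits) == bits;
    }

    /** Current rule: OTHER can read the file and traverse every ancestor. */
    static boolean isPublicStrict(Map<String, Integer> modes, String file) {
        if (!hasBits(modes, file, OTHER_READ)) {
            return false;
        }
        for (String dir = parent(file); dir != null; dir = parent(dir)) {
            if (!hasBits(modes, dir, OTHER_EXEC)) {
                return false;
            }
        }
        return true;
    }

    /** Proposed rule: only the permission on the file itself is checked. */
    static boolean isPublicRelaxed(Map<String, Integer> modes, String file) {
        return hasBits(modes, file, OTHER_READ);
    }

    /** Sample tree: folders default to 700, the file was opened up by its owner. */
    static Map<String, Integer> sampleTree() {
        Map<String, Integer> modes = new HashMap<>();
        modes.put("/", 0700);
        modes.put("/cache", 0700);
        modes.put("/cache/lib.jar", 0704);
        return modes;
    }

    public static void main(String[] args) {
        Map<String, Integer> modes = sampleTree();
        System.out.println("strict:  " + isPublicStrict(modes, "/cache/lib.jar"));
        System.out.println("relaxed: " + isPublicRelaxed(modes, "/cache/lib.jar"));
    }
}
```

With the sample tree, the strict check rejects the file because the "700" folders block traversal by OTHER, while the relaxed check accepts it because the owner explicitly granted read access on the file itself.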
