hadoop-common-issues mailing list archives

From "Vinod K V (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4493) Localized files from DistributedCache should have right access-control
Date Tue, 07 Jul 2009 06:12:14 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727939#action_12727939

Vinod K V commented on HADOOP-4493:

*This issue's focus*

Given the state of the art, what do we mean by restrictive permissions for files in distributed

*Case I: The cache files are completely private to the job-owner*
 - The job owner wants the files to be accessible only to himself/herself and to no one else.
 - This means 700 permissions with the job owner as the file owner, so only the job owner can access
the files.
 - To facilitate the above, the cache should be localized under a directory named after the user.
 - The files should be usable by subsequent jobs of the same user.
 - The configurable size limit of the cache, in other words the disk quota for the cache
files, should be per user.
 - Because the files are owned by the user, cleaning up or purging them will require launching
a task-controller process.
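
A minimal sketch of the Case I layout, assuming a POSIX file system. The helper name localize_private and the <cache_root>/<user>/ layout are hypothetical and only illustrate the idea; this is not Hadoop's localization code. Changing the file owner to the job owner is deliberately left out, since that needs a privileged helper such as the task-controller.

```python
import os

PRIVATE_MODE = 0o700  # rwx for the owner only, as in Case I


def localize_private(cache_root, user, file_name, data):
    """Write a cache file under cache_root/<user>/ with owner-only access.

    Hypothetical helper: the names and directory layout are illustrative.
    Ownership is not changed here; that would need a privileged process.
    """
    user_dir = os.path.join(cache_root, user)
    os.makedirs(user_dir, exist_ok=True)
    # chmod explicitly, because a mode passed to makedirs is subject to umask
    os.chmod(user_dir, PRIVATE_MODE)

    path = os.path.join(user_dir, file_name)
    with open(path, "wb") as f:
        f.write(data)
    os.chmod(path, PRIVATE_MODE)
    return path
```

Because the directory is keyed by user name, a later job of the same user can look for the file at the same path, while other users are stopped at the directory itself.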

*Case II: The job owner is fine with sharing his/her distributed cache files with other users*
 - A possible use case for this, as mentioned on HADOOP-4490, is multiple users, perhaps
working on the same project that requires the same data files, who want to share these files across
multiple jobs, possibly jobs of other users.
 - The above would mean _55 permissions for everyone.
 - The files should be localized under a common directory not associated with any user name.
 - There should be a configuration knob (per cache URI?), specified by the user, to
distinguish these files from the ones that belong to CASE I.
 - The disk quota for the cache files should be a global one and not specific to the user.
 - The files can be owned by the job owner or the TT itself.
    -- Having the TT own the files makes cleaning up easy.
    -- If the ownership is with the job-owner, we would need a task-controller process launch.
We would also need user information with the files, perhaps through a sub-directory associated
with a user name.
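
To make the difference between the two cases concrete, here is a rough sketch of how a per-file "shared" flag could select the directory, the permission bits, and the quota bucket. Everything in it is an assumption for illustration: the names cache_placement and QuotaTracker, the "shared" and "users" directory names, and the reading of _55 as 755. None of it is an existing Hadoop API or configuration key.

```python
import os

PRIVATE_MODE = 0o700   # Case I: owner only
SHARED_MODE = 0o755    # Case II: one possible reading of "_55"
SHARED_DIR = "shared"  # hypothetical name for the common directory


def cache_placement(cache_root, user, shared):
    """Return (directory, mode, quota_key) for one cache file.

    quota_key is the user name for private files and None for the
    single global bucket used by shared files.
    """
    if shared:
        return os.path.join(cache_root, SHARED_DIR), SHARED_MODE, None
    # Private files live under users/<user> so that a user who happens
    # to be called "shared" cannot collide with the common directory.
    return os.path.join(cache_root, "users", user), PRIVATE_MODE, user


class QuotaTracker:
    """Tracks bytes used per user (Case I) and globally (Case II)."""

    def __init__(self, per_user_limit, global_limit):
        self.per_user_limit = per_user_limit
        self.global_limit = global_limit
        self.used = {}  # quota_key -> bytes used; None is the global bucket

    def try_charge(self, quota_key, size):
        """Charge size bytes to the bucket, or return False if over limit."""
        limit = self.global_limit if quota_key is None else self.per_user_limit
        current = self.used.get(quota_key, 0)
        if current + size > limit:
            return False
        self.used[quota_key] = current + size
        return True
```

If both cases were supported together, a flag like this would be the knob mentioned above: the default could stay private, and only files explicitly marked as shared would land in the common directory and count against the global quota.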

So, do we want only CASE I or only CASE II or a combination of both? Thoughts/comments?

> Localized files from DistributedCache should have right access-control
> ----------------------------------------------------------------------
>                 Key: HADOOP-4493
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4493
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: mapred, security
>            Reporter: Arun C Murthy
