hadoop-hdfs-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4968) Provide configuration option for FileSystem symlink resolution
Date Mon, 22 Jul 2013 20:48:49 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13715630#comment-13715630 ]

Andrew Wang commented on HDFS-4968:
-----------------------------------

Hey Colin, thanks for the review. New patch rebased on trunk, since things have changed a
bit.

bq. this will do nothing to change the Configuration set inside the AbstractFileSystem

We get a new AFS each time we call into FileContext (there isn't any caching like there is
with FileSystem), so I think making this Configurable works.

bq. also be nice to see a FileSystem based test case

The provided test case also tests both FileContext and FileSystem. I could separate these
if you wish.

btw, I'm also going to move this to HADOOP rather than HDFS, since it's not just disabling
DFS resolution.
                
> Provide configuration option for FileSystem symlink resolution
> --------------------------------------------------------------
>
>                 Key: HDFS-4968
>                 URL: https://issues.apache.org/jira/browse/HDFS-4968
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.0.0, 2.3.0
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>         Attachments: hdfs-4968-1.patch, hdfs-4968-2.patch, hdfs-4968-3.patch
>
>
> With FileSystem symlink support incoming in HADOOP-8040, some clients will wish to not
transparently resolve symlinks. This is somewhat similar to O_NOFOLLOW in open(2).
> The rationale is a security model where a user can invoke a third-party service running
as a service user to operate on the user's data. For instance, users might want to use Hive
to query data in their homedirs, where Hive runs as the Hive user and the data is readable
by the Hive user. This leads to a security issue with symlinks:
> # User Mallory invokes Hive to process data files in {{/user/mallory/hive/}}
> # Hive checks permissions on the files in {{/user/mallory/hive/}} and allows the query
to proceed.
> # RACE: Mallory replaces the files in {{/user/mallory/hive}} with symlinks that point
to user Ann's Hive files in {{/user/ann/hive}}. These files aren't readable by Mallory, but
she can create whatever symlinks she wants in her own scratch directory.
> # Hive's MR jobs happily resolve the symlinks and access Ann's private data.
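
The race in the steps above can be sketched on a local filesystem. This is only an analogue, not the Hadoop API or the option added by the patch: it uses the JDK's java.nio.file classes, and the class name NoFollowSketch, the method names, and the directory layout are made up for illustration. It assumes a platform where the process may create symlinks. The no-follow read behaves like O_NOFOLLOW in open(2): it refuses to open the path when its final component is a symlink.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class NoFollowSketch {

    /** Reads a file, transparently following a symlink at the path. */
    static String readFollowing(Path p) throws IOException {
        return new String(Files.readAllBytes(p), StandardCharsets.UTF_8);
    }

    /**
     * Reads a file but refuses to open it if the final path component is a
     * symlink (local-filesystem analogue of O_NOFOLLOW).
     */
    static String readNoFollow(Path p) throws IOException {
        try (SeekableByteChannel ch = Files.newByteChannel(
                p, StandardOpenOption.READ, LinkOption.NOFOLLOW_LINKS)) {
            ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
            while (buf.hasRemaining() && ch.read(buf) > 0) {
                // keep reading until the buffer is full or EOF is reached
            }
            return new String(buf.array(), 0, buf.position(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical layout standing in for /user/ann/hive and /user/mallory/hive.
        Path root = Files.createTempDirectory("symlink-race");
        Path annFile = Files.createDirectories(root.resolve("ann")).resolve("private.txt");
        Files.write(annFile, "ann's data".getBytes(StandardCharsets.UTF_8));
        Path malloryFile = Files.createDirectories(root.resolve("mallory")).resolve("data.txt");
        Files.write(malloryFile, "mallory's data".getBytes(StandardCharsets.UTF_8));

        // Step 2: the service checks the path while it is still a regular file.
        boolean looksOk = Files.isRegularFile(malloryFile, LinkOption.NOFOLLOW_LINKS);
        System.out.println("check passed: " + looksOk);

        // Step 3 (RACE): after the check, the file is swapped for a symlink.
        Files.delete(malloryFile);
        Files.createSymbolicLink(malloryFile, annFile);

        // Step 4: a following read returns the other user's data.
        System.out.println("following read: " + readFollowing(malloryFile));

        // With no-follow semantics the open itself fails instead.
        try {
            readNoFollow(malloryFile);
            System.out.println("no-follow read unexpectedly succeeded");
        } catch (IOException e) {
            System.out.println("no-follow read refused");
        }
    }
}
```

Note that, as with O_NOFOLLOW, this only guards the final path component; a symlink in a parent directory is still followed.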

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
