hadoop-hdfs-issues mailing list archives

From "Stephen O'Donnell (Jira)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-15372) Files in snapshots no longer see attribute provider permissions
Date Tue, 09 Jun 2020 11:24:00 GMT

     [ https://issues.apache.org/jira/browse/HDFS-15372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen O'Donnell updated HDFS-15372:
-------------------------------------
    Attachment: HDFS-15372.004.patch

> Files in snapshots no longer see attribute provider permissions
> ---------------------------------------------------------------
>
>                 Key: HDFS-15372
>                 URL: https://issues.apache.org/jira/browse/HDFS-15372
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>         Attachments: HDFS-15372.001.patch, HDFS-15372.002.patch, HDFS-15372.003.patch,
HDFS-15372.004.patch
>
>
> Given a cluster with an authorization provider configured (e.g. Sentry), where the paths covered
by the provider are snapshottable, the way provider permissions and ACLs are applied to files
in snapshots changed between the 2.x branch and Hadoop 3.0.
> For example, suppose the snapshottable path /data is managed by Sentry. The ACLs below are
provided by Sentry:
> {code}
> hadoop fs -getfacl -R /data
> # file: /data
> # owner: hive
> # group: hive
> user::rwx
> group::rwx
> other::--x
> # file: /data/tab1
> # owner: hive
> # group: hive
> user::rwx
> group::---
> group:flume:rwx
> user:hive:rwx
> group:hive:rwx
> group:testgroup:rwx
> mask::rwx
> other::--x
> /data/tab1
> {code}
> After taking a snapshot, the files in the snapshot do not see the provider permissions:
> {code}
> hadoop fs -getfacl -R /data/.snapshot
> # file: /data/.snapshot
> # owner: 
> # group: 
> user::rwx
> group::rwx
> other::rwx
> # file: /data/.snapshot/snap1
> # owner: hive
> # group: hive
> user::rwx
> group::rwx
> other::--x
> # file: /data/.snapshot/snap1/tab1
> # owner: hive
> # group: hive
> user::rwx
> group::rwx
> other::--x
> {code}
> However, before Hadoop 3.0 (in which the attribute provider code was extensively refactored),
files in snapshots did get the provider permissions.
> The cause is the following code in FSDirectory.java, which ultimately calls the attribute provider
and passes it the path we want permissions for:
> {code}
>   INodeAttributes getAttributes(INodesInPath iip)
>       throws IOException {
>     INode node = FSDirectory.resolveLastINode(iip);
>     int snapshot = iip.getPathSnapshotId();
>     INodeAttributes nodeAttrs = node.getSnapshotINode(snapshot);
>     UserGroupInformation ugi = NameNode.getRemoteUser();
>     INodeAttributeProvider ap = this.getUserFilteredAttributeProvider(ugi);
>     if (ap != null) {
>       // permission checking sends the full components array including the
>       // first empty component for the root.  however file status
>       // related calls are expected to strip out the root component according
>       // to TestINodeAttributeProvider.
>       byte[][] components = iip.getPathComponents();
>       components = Arrays.copyOfRange(components, 1, components.length);
>       nodeAttrs = ap.getAttributes(components, nodeAttrs);
>     }
>     return nodeAttrs;
>   }
> {code}
> The line:
> {code}
> INode node = FSDirectory.resolveLastINode(iip);
> {code}
> picks the last resolved inode. If you then call node.getPathComponents() for a path
like '/data/.snapshot/snap1/tab1', it returns /data/tab1: the snapshot path is resolved
to its original location, although the inode is still the snapshot inode.
> However, the logic passes iip.getPathComponents() to the provider, which returns
"/data/.snapshot/snap1/tab1".
> The pre-Hadoop 3.0 code passes the inode directly to the provider, and hence the provider only
ever sees the path as "/data/tab1".
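
The difference between the two paths can be illustrated with a minimal, standalone sketch. It is not the attached patch and does not use any HDFS classes: the class name SnapshotPathSketch and the method resolveToLivePath are hypothetical, and plain strings stand in for the byte[][] components used in FSDirectory. It assumes the leading root component has already been stripped, as in the code quoted above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not HDFS code and not the HDFS-15372 patch.
class SnapshotPathSketch {
  static final String DOT_SNAPSHOT = ".snapshot";

  // Maps a snapshot path to its original location by dropping the
  // ".snapshot" component and the snapshot name that follows it.
  static String[] resolveToLivePath(String[] components) {
    List<String> out = new ArrayList<>();
    for (int i = 0; i < components.length; i++) {
      if (DOT_SNAPSHOT.equals(components[i])) {
        i++; // also skip the snapshot name, if there is one
        continue;
      }
      out.add(components[i]);
    }
    return out.toArray(new String[0]);
  }

  public static void main(String[] args) {
    // What iip.getPathComponents() yields after the root component is stripped.
    String[] requested = {"data", ".snapshot", "snap1", "tab1"};
    // What the provider saw before Hadoop 3.0.
    System.out.println("/" + String.join("/", resolveToLivePath(requested)));
  }
}
```

A provider that keys its rules on /data/tab1 would match the second form but not the first, which is consistent with the getfacl output shown earlier.
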
> It is debatable which path should be passed to the provider for snapshots: /data/.snapshot/snap1/tab1
or /data/tab1. However, as the behaviour has changed, I feel we should
ensure the old behaviour is retained.
> It would also be fairly easy to provide a config switch so the provider gets either the full
snapshot path or the resolved path.
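
One way such a switch might look is sketched below. This is an assumption for illustration, not the attached patch: the class ProviderPathSwitchSketch, the method componentsForProvider, and the property key are all invented here, and java.util.Properties stands in for the Hadoop configuration object. The default keeps the pre-3.0 behaviour of passing the resolved path.

```java
import java.util.Properties;

// Hypothetical illustration only; names and the property key are invented for this sketch.
class ProviderPathSwitchSketch {
  // Invented key; not a real HDFS configuration property.
  static final String PASS_SNAPSHOT_PATH_KEY = "example.attribute.provider.pass-snapshot-path";

  // Returns the components to hand to the attribute provider.
  // Defaults to the resolved (original location) path, i.e. the pre-3.0 behaviour.
  static String[] componentsForProvider(
      Properties conf, String[] requested, String[] resolved) {
    boolean passSnapshotPath =
        Boolean.parseBoolean(conf.getProperty(PASS_SNAPSHOT_PATH_KEY, "false"));
    return passSnapshotPath ? requested : resolved;
  }

  public static void main(String[] args) {
    String[] requested = {"data", ".snapshot", "snap1", "tab1"};
    String[] resolved = {"data", "tab1"};

    Properties conf = new Properties();
    System.out.println(String.join("/", componentsForProvider(conf, requested, resolved)));

    conf.setProperty(PASS_SNAPSHOT_PATH_KEY, "true");
    System.out.println(String.join("/", componentsForProvider(conf, requested, resolved)));
  }
}
```

Defaulting to the resolved path means existing providers would see the same paths as on the 2.x branch unless an operator opts in to the full snapshot path.
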



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

