hadoop-common-dev mailing list archives

From "Sanjay Radia (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4348) Adding service-level authorization to Hadoop
Date Thu, 23 Oct 2008 01:00:48 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12642024#action_12642024

Sanjay Radia commented on HADOOP-4348:

Strictly speaking, rpc does not imply any notion of a connection; one could have a connection per rpc.
The sharing of a connection is merely an optimization.

I think the question you have raised is deeper than at which layer to implement service authorization.

We already have method level authorization in HDFS - each method implements its own authorization
against the file object being accessed.
This Jira is proposing to add service level authorization.
The notion of service level authorization (as opposed to rpc method level authorization) implies
that one is performing authorization for a session of rpc calls. Since, strictly speaking, rpc has
no notion of a session (as stated above), one *could* argue that, from a layering point of view,
service level authorization does not make sense for rpc in the first place.
But conceptually we *do* want service level authorization (as in, a set of users is allowed
to access the methods of a particular service).
The best way to represent that service access is when a service proxy object is created -
e.g. when the connection is established.
We could share multiple service sessions in a single connection, but that complexity is not
worth it.
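To make the distinction concrete, here is a minimal sketch (the class and method names are my own illustrations, not the actual API in the attached patch) of a flat ACL consulted once, when the connection/service proxy is established, rather than on every rpc:

```java
import java.util.Set;

// Illustrative sketch only: a service-level ACL checked once per
// connection, before any per-method (e.g. file-permission) checks run.
public class ServiceAuthSketch {
    private final Set<String> allowedUsers;   // flat list of usernames
    private final Set<String> allowedGroups;  // flat list of group names

    public ServiceAuthSketch(Set<String> users, Set<String> groups) {
        this.allowedUsers = users;
        this.allowedGroups = groups;
    }

    /** Called once per connection, not once per rpc. */
    public boolean authorizeConnection(String user, Set<String> userGroups) {
        if (allowedUsers.contains(user)) {
            return true;
        }
        for (String g : userGroups) {
            if (allowedGroups.contains(g)) {
                return true;
            }
        }
        return false;
    }
}
```

Per-method authorization (e.g. the HDFS check against the file object) would still run afterwards on each call; this check only gates entry to the service as a whole.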

So Doug, I see your argument as equivalent to arguing against service level authorization,
i.e. that method level authorization is sufficient.
Clearly method level is sufficient. I feel service level is useful (even though not necessary)
and that it is best captured below the method level invocation. In the current impl that is
at the ipc layer. When you created the ipc layer, did you envision multiple upper layers (such
as rpc, reliable datagrams, streaming, etc.) sharing a single instance of the ipc layer?
If you did, then I can perhaps understand your point of view. Would you be happier if we created
an intermediate layer, say rpc-session, in between? I am not seriously suggesting we do that.
But if that would conceptually make you happier, then simply assume that we have decided to
put one session per connection.

Does the above help, or does it further confuse the matter?

> Adding service-level authorization to Hadoop
> --------------------------------------------
>                 Key: HADOOP-4348
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4348
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: Kan Zhang
>            Assignee: Arun C Murthy
>             Fix For: 0.20.0
>         Attachments: HADOOP-4348_0_20081022.patch
> Service-level authorization is the initial checking done by a Hadoop service to find
out if a connecting client is a pre-defined user of that service. If not, the connection or
service request will be declined. This feature allows services to limit access to a clearly
defined group of users. For example, service-level authorization allows "world-readable" files
on an HDFS cluster to be readable only by the pre-defined users of that cluster, not by anyone
who can connect to the cluster. It also allows an M/R cluster to define its group of users
so that only those users can submit jobs to it.
> Here is an initial list of requirements I came up with.
>     1. Users of a cluster are defined by a flat list of usernames and groups. A client
is a user of the cluster if and only if her username is listed in the flat list or one of
her groups is explicitly listed in the flat list. Nested groups are not supported.
>     2. The flat list is stored in a conf file and pushed to every cluster node so that
services can access it.
>     3. Services will monitor the conf file for modification periodically (5-minute interval
by default) and reload the list if needed.
>     4. Checking against the flat list is done as early as possible and before any other
authorization checking. Both HDFS and M/R clusters will implement this feature.
>     5. This feature can be switched off and is off by default.
> I'm aware of interest in pulling user data from LDAP. For this JIRA, I suggest we implement
it using a conf file. Additional data sources may be supported via new JIRAs.
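For what it's worth, requirement 3 (periodic reload of the conf file) might look roughly like the following sketch; the class name and polling logic are illustrative assumptions on my part, not the attached patch:

```java
import java.io.File;

// Illustrative sketch of requirement 3: poll the conf file's modification
// time on a fixed interval (5 minutes by default in the proposal) and
// signal a reload only when the file has actually changed.
public class AclReloader {
    private final File confFile;
    private final long pollIntervalMs;
    private long lastModified;
    private long lastChecked;

    public AclReloader(File confFile, long pollIntervalMs) {
        this.confFile = confFile;
        this.pollIntervalMs = pollIntervalMs;
        this.lastModified = confFile.lastModified();
        this.lastChecked = System.currentTimeMillis();
    }

    /** Returns true when the caller should re-read the flat list. */
    public boolean shouldReload(long nowMs) {
        if (nowMs - lastChecked < pollIntervalMs) {
            return false; // not time to poll yet
        }
        lastChecked = nowMs;
        long mtime = confFile.lastModified();
        if (mtime != lastModified) {
            lastModified = mtime; // remember the version we reloaded
            return true;
        }
        return false;
    }
}
```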

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
