hadoop-common-dev mailing list archives

From "Sanjay Radia (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4348) Adding service-level authorization to Hadoop
Date Thu, 23 Oct 2008 18:56:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12642230#action_12642230 ]

Sanjay Radia commented on HADOOP-4348:

Doug says:
Sanjay> The best way to represent that service access is when a service proxy object is
created, e.g. when the connection is established.
Doug> A proxy is not bound to a single connection. Connections are retrieved from a cache each
time a call is made. Different proxies may share the same connection, and a single proxy may
use different connections for different calls.

Good point, I missed that.
But a proxy object still represents a session that may need authentication and
service-level authorization.

Sanjay> I see your argument to be equivalent to arguing against service level authorization
and that method level authorization is sufficient.
Doug> No, but we will eventually probably need method-level authorization too, and it would
be nice if whatever support we add now also helps then. If we do this in RPC, then we can
examine only the protocol name for now, and subsequently add method-level authorization at
the same place. So implementing service-level-authentication this way better prepares us for
method-level authentication.

We already have method-level authorization in HDFS (i.e. permission checking). Doing it in
the rpc method-invocation layer would not work, since the check is very specific to the method
in question: one has to check the permissions along the path. Thus, in general, method-level
authorization is best done inside the *implementation* of each method and not in the rpc layer.
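
To illustrate why the check belongs in the method implementation, here is a minimal sketch of
a path-based check. It is not the HDFS permission checker; the class, the method names and the
per-directory allow-list are all hypothetical and only stand in for real permission bits.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; not the actual HDFS permission-checking code.
final class PathPermissionSketch {
    // Maps a path to the set of users allowed to traverse it (assumed model).
    private final Map<String, Set<String>> traverseAcl = new HashMap<>();

    void allow(String path, Set<String> users) {
        traverseAcl.put(path, users);
    }

    // Meant to be called from inside a method implementation (e.g. an "open"
    // call), because only the implementation knows which path is touched.
    boolean mayTraverse(String user, String path) {
        StringBuilder prefix = new StringBuilder();
        for (String component : path.split("/")) {
            if (component.isEmpty()) {
                continue; // skip the empty piece before the leading slash
            }
            prefix.append('/').append(component);
            Set<String> allowed = traverseAcl.get(prefix.toString());
            if (allowed == null || !allowed.contains(user)) {
                return false; // one denied component denies the whole path
            }
        }
        return true;
    }
}
```

A generic rpc layer sees only a protocol name, a method name and opaque arguments, so it has
no natural place for a loop like this.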

I completely agree with you that the session layer (if we choose to create one) should be
done in *one of the two layers* (ipc or rpc) but *not both*.
In my earlier comment I pointed out a way to fix Arun's patch to do it at one layer.
But I do agree that it may be worth creating a session layer explicitly to support this, rather
than just putting the code in one of the two layers.

Service-level authorization can be done in one of 3 places:
# In the method implementation (forces every method to remember to do it).
# In the rpc invocation layer (as you suggest).
# In a session layer below.
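
As a rough illustration of option 2 in the list above, the sketch below puts a single check in
a dispatcher that every call passes through and that examines only the protocol name. The class
and method names are hypothetical and are not taken from Hadoop's RPC code or from either
attached patch.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a check in the rpc invocation layer (option 2).
final class RpcDispatchSketch {
    // Protocol name -> users allowed to call that service (assumed model).
    private final Map<String, Set<String>> serviceAcl;

    RpcDispatchSketch(Map<String, Set<String>> serviceAcl) {
        this.serviceAcl = serviceAcl;
    }

    // Every incoming call goes through here before the target method runs,
    // so individual methods do not have to remember the check (option 1).
    String invoke(String user, String protocol, String method) {
        Set<String> allowed = serviceAcl.get(protocol);
        if (allowed == null || !allowed.contains(user)) {
            throw new SecurityException(user + " is not authorized for " + protocol);
        }
        // A real dispatcher would look up and call the method here; the sketch
        // just returns a label naming what would have been invoked.
        return protocol + "." + method;
    }
}
```

The cost of this placement is that the same lookup runs on every call, which is the overhead a
session layer (option 3) would avoid.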

But if this Jira is indeed about service-level semantics rather than method-level semantics, we
should strongly consider creating the session layer, since authentication will also need it.
Doing authentication on a per-rpc basis is very expensive: not incorrect, but expensive.
Authentication often requires a challenge-response exchange, and there may also be encryption
negotiation. While this can be done per rpc, it is best done at the session level. I will let Kan
give us some details on this. I am merely arguing for some sort of session-level context:
authentication is one use case for it, and service authorization is another.
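
A minimal sketch of such a session-level context follows. It assumes the expensive handshake
can be modelled as a single establish() step whose result later calls reuse. All names are
hypothetical, and the handshake itself is a stand-in rather than a real challenge-response or
GSS exchange.

```java
import java.util.Set;

// Hypothetical sketch of a session-level context; not a Hadoop API.
final class SessionSketch {
    private final Set<String> serviceUsers; // users allowed to use this service
    private String authenticatedUser;       // null until the handshake succeeds
    private int handshakes;                 // how often the expensive step ran

    SessionSketch(Set<String> serviceUsers) {
        this.serviceUsers = serviceUsers;
    }

    // Stand-in for the challenge-response exchange and any encryption
    // negotiation, followed by the service-level authorization check.
    void establish(String user) {
        handshakes++;
        if (!serviceUsers.contains(user)) {
            throw new SecurityException(user + " is not a user of this service");
        }
        authenticatedUser = user;
    }

    // Per-call path: only consults state established once for the session.
    String call(String method) {
        if (authenticatedUser == null) {
            throw new IllegalStateException("session not established");
        }
        return authenticatedUser + ":" + method;
    }

    int handshakeCount() {
        return handshakes;
    }
}
```

With this shape, any number of calls on one session cost a single handshake, whereas per-rpc
authentication would repeat it for each call.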

Now here is where the clean abstractions get messy:
I suspect that if we want to take advantage of the Java security APIs such as GSS (Kan, please
correct me here, as I am only casually familiar with the security APIs), then we may want to
have one session per connection (not necessary, but simpler). This would break your original
vision of sharing an ipc instance across many upper-layer instances. I think we could rework
some of the interfaces to allow an upper layer to say that it wants a dedicated ipc-layer instance.
We can explore that further if we are in agreement on the rest.

> Adding service-level authorization to Hadoop
> --------------------------------------------
>                 Key: HADOOP-4348
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4348
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: Kan Zhang
>            Assignee: Arun C Murthy
>             Fix For: 0.20.0
>         Attachments: HADOOP-4348_0_20081022.patch, jaas_service_v1.patch
> Service-level authorization is the initial checking done by a Hadoop service to find
> out if a connecting client is a pre-defined user of that service. If not, the connection or
> service request will be declined. This feature allows services to limit access to a clearly
> defined group of users. For example, service-level authorization allows "world-readable" files
> on a HDFS cluster to be readable only by the pre-defined users of that cluster, not by anyone
> who can connect to the cluster. It also allows a M/R cluster to define its group of users
> so that only those users can submit jobs to it.
> Here is an initial list of requirements I came up with.
>     1. Users of a cluster is defined by a flat list of usernames and groups. A client
>        is a user of the cluster if and only if her username is listed in the flat list or one of
>        her groups is explicitly listed in the flat list. Nested groups are not supported.
>     2. The flat list is stored in a conf file and pushed to every cluster node so that
>        services can access them.
>     3. Services will monitor the modification of the conf file periodically (5 mins interval
>        by default) and reload the list if needed.
>     4. Checking against the flat list is done as early as possible and before any other
>        authorization checking. Both HDFS and M/R clusters will implement this feature.
>     5. This feature can be switched off and is off by default.
> I'm aware of interests in pulling user data from LDAP. For this JIRA, I suggest we implement
> it using a conf file. Additional data sources may be supported via new JIRA's.
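
For reference, requirement 1 in the quoted description amounts to a simple membership test. The
sketch below is one possible reading of it; the class and method names are hypothetical, and
loading the list from the conf file and reloading it periodically are left out.

```java
import java.util.Set;

// Hypothetical sketch of the flat-list check from requirement 1.
final class FlatListSketch {
    private final Set<String> listedUsers;  // usernames in the flat list
    private final Set<String> listedGroups; // groups in the flat list

    FlatListSketch(Set<String> listedUsers, Set<String> listedGroups) {
        this.listedUsers = listedUsers;
        this.listedGroups = listedGroups;
    }

    // A client is a cluster user if and only if her username is listed or
    // one of her groups is listed explicitly. Nested groups are not expanded.
    boolean isClusterUser(String username, Set<String> userGroups) {
        if (listedUsers.contains(username)) {
            return true;
        }
        for (String group : userGroups) {
            if (listedGroups.contains(group)) {
                return true;
            }
        }
        return false;
    }
}
```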

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
