hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4348) Adding service-level authorization to Hadoop
Date Wed, 22 Oct 2008 22:25:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641984#action_12641984 ]

Doug Cutting commented on HADOOP-4348:

Looking at the implementation I'm reminded of HADOOP-4049.

Hadoop's RPC is implemented as two layers: Server & Client implement a simple transport
that sends a stream of serialized instances to a server, where they're processed and then
serialized instances are streamed back to the client.  RPC layers methods, parameters, etc.
on top of this.  The layering isn't perfect, but it's still worth preserving.  If we wish
to replace the transport or the RPC logic someday, then keeping the layers distinct should
simplify things.

When a change requires changes to both layers, as this patch does, that raises a red flag,
and makes me wonder if it might better be done at one level or the other, rather than spread
across both.  HADOOP-4049 started out modifying both layers, but eventually wound up only
modifying the RPC layer, and it became a simpler patch for it.

So I wonder if this might also be implemented by adding fields to Invocation, so that each
call passes the protocol name and the invoking ugi.  Then the Invoker can check these.  Wouldn't
that be simpler & contain the implementation to a single layer?
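
To make the single-layer idea concrete, here is a minimal sketch, assuming the check lives entirely in the RPC layer. The class names mirror the comment above, but everything here is a hypothetical stand-in rather than Hadoop's actual code: the ugi is reduced to a plain user-name string, the per-protocol allow-list is an in-memory map, and the protocol and user names in the usage note are illustrative.

```java
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: these are hypothetical stand-ins, not Hadoop's real classes.
final class RpcLayerAuthSketch {

    /** Stand-in for the RPC layer's Invocation, carrying the two proposed extra fields. */
    static final class Invocation {
        final String protocolName; // proposed field: protocol the caller is invoking
        final String user;         // proposed field: plain-string stand-in for the invoking ugi
        final String methodName;
        final Object[] parameters;

        Invocation(String protocolName, String user, String methodName, Object... parameters) {
            this.protocolName = protocolName;
            this.user = user;
            this.methodName = methodName;
            this.parameters = parameters;
        }
    }

    /** Stand-in for the server-side Invoker, which authorizes each call before dispatching it. */
    static final class Invoker {
        // Assumed shape of the policy: protocol name -> users allowed to call it.
        private final Map<String, Set<String>> allowedUsersByProtocol;

        Invoker(Map<String, Set<String>> allowedUsersByProtocol) {
            this.allowedUsersByProtocol = allowedUsersByProtocol;
        }

        Object invoke(Invocation call) {
            Set<String> allowed = allowedUsersByProtocol.get(call.protocolName);
            if (allowed == null || !allowed.contains(call.user)) {
                throw new SecurityException(
                    "User " + call.user + " is not authorized for protocol " + call.protocolName);
            }
            // A real Invoker would dispatch to the target method reflectively;
            // this sketch just returns a description of the authorized call.
            return call.protocolName + "." + call.methodName;
        }
    }
}
```

With a policy that maps "ClientProtocol" to the single user "alice", a call from "alice" is dispatched and a call from any other user is rejected with a SecurityException. The transport (Server & Client) never sees the protocol name or the user; it only moves the serialized Invocation.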

> Adding service-level authorization to Hadoop
> --------------------------------------------
>                 Key: HADOOP-4348
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4348
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: Kan Zhang
>            Assignee: Arun C Murthy
>             Fix For: 0.20.0
>         Attachments: HADOOP-4348_0_20081022.patch
> Service-level authorization is the initial check a Hadoop service performs to find
> out whether a connecting client is a pre-defined user of that service. If not, the connection or
> service request will be declined. This feature allows services to limit access to a clearly
> defined group of users. For example, service-level authorization allows "world-readable" files
> on an HDFS cluster to be readable only by the pre-defined users of that cluster, not by anyone
> who can connect to the cluster. It also allows an M/R cluster to define its group of users
> so that only those users can submit jobs to it.
> Here is an initial list of requirements I came up with.
>     1. Users of a cluster are defined by a flat list of usernames and groups. A client
>        is a user of the cluster if and only if her username is listed in the flat list or one of
>        her groups is explicitly listed in the flat list. Nested groups are not supported.
>     2. The flat list is stored in a conf file and pushed to every cluster node so that
>        services can access it.
>     3. Services will periodically check the conf file for modification (every 5 minutes
>        by default) and reload the list if needed.
>     4. Checking against the flat list is done as early as possible and before any other
>        authorization checking. Both HDFS and M/R clusters will implement this feature.
>     5. This feature can be switched off and is off by default.
> I'm aware of interest in pulling user data from LDAP. For this JIRA, I suggest we implement
> it using a conf file. Additional data sources may be supported via new JIRAs.
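
The membership rule in requirements 1 and 5 can be sketched as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, the flat list is passed in directly rather than read from a conf file, periodic reloading is left out, and "switched off" is taken to mean that every client is admitted.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the flat-list check; not part of Hadoop.
final class FlatUserListSketch {
    private final boolean enabled;     // requirement 5: the feature can be switched off
    private final Set<String> users;   // usernames listed in the flat list
    private final Set<String> groups;  // groups listed in the flat list (no nesting)

    FlatUserListSketch(boolean enabled, Collection<String> users, Collection<String> groups) {
        this.enabled = enabled;
        this.users = new HashSet<>(users);
        this.groups = new HashSet<>(groups);
    }

    /** Returns true if the client may use the service under the flat-list rule. */
    boolean isAuthorized(String username, Collection<String> userGroups) {
        if (!enabled) {
            return true; // assumption: when the feature is off, nobody is turned away
        }
        if (users.contains(username)) {
            return true; // the username itself is listed
        }
        for (String group : userGroups) {
            if (groups.contains(group)) {
                return true; // one of the client's groups is explicitly listed
            }
        }
        return false; // neither the username nor any of the groups is listed
    }
}
```

A service would run this check before any other authorization step, as requirement 4 asks, and would build a fresh instance whenever the conf file changes.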

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
