hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hive/AuthDev" by HeYongqiang
Date Thu, 21 Oct 2010 08:11:18 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/AuthDev" page has been changed by HeYongqiang.
http://wiki.apache.org/hadoop/Hive/AuthDev

--------------------------------------------------

New page:

= 1. Privilege =

== 1.1 Access Privilege ==

Privileges exist at four levels: admin, database (DB), table, and column.

1.1.1 Admin privileges are global privileges, and are used to perform administration.

1.1.2 DB privileges are database specific, and apply to all objects inside that database.

1.1.3 Table privileges apply to tables, views, and indexes in a given database.

1.1.4 Column privileges apply to individual columns.

All DB, table, and column privileges distinguish read from write access, even though Hive
does not currently support column-level overwrite. There is no partition-level privilege.

= 2. Hive Operations =

create index/drop index

create database/drop database

create table/drop table

create view/drop view

alter table

show databases

lock table/unlock table/show lock

add partition

archive

Select

insert overwrite directory

insert overwrite table

Others include "create table as", "create table like", etc.

= 3. Metadata =

Store the privilege information in the new metastore tables 'user', 'db', ['host'], 'tables_priv',
and 'columns_priv'.

The user table holds each user's global privileges, which apply to all databases.
The db table determines database-level access privileges, which apply to all objects inside
that database.

The host table restricts the host names from which the privileges granted to a given user
apply.
[I am not sure if we need to have this table.]


== 3.1 user, group, and roles ==

A user can belong to one or more groups. The group information is provided by the authenticator.

Each user or group can have one or more roles. A role can be a member of another role, but
role membership must not be circular.

So hive metadata needs to store:

1) roles

2) Hive user/group -> role mapping
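
The "nested but not circular" rule for roles can be enforced when a role is added to another role. The following is a minimal Python sketch, not Hive code: the `parents` dictionary and both function names are hypothetical, and the check simply walks upward from the target role to see whether it already reaches the role being added.

```python
def would_create_cycle(parents, member, role):
    """Return True if making `member` a member of `role` would close a cycle.

    `parents` (hypothetical structure) maps a role name to the set of roles
    it is already a member of.
    """
    # A cycle appears if `role` is already, directly or transitively,
    # a member of `member`. Walk upward from `role` to find out.
    stack, seen = [role], set()
    while stack:
        current = stack.pop()
        if current == member:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, ()))
    return False


def add_role_to_role(parents, member, role):
    """Record that `member` is a member of `role`, rejecting circular nesting."""
    if would_create_cycle(parents, member, role):
        raise ValueError(f"adding {member!r} to {role!r} would be circular")
    parents.setdefault(member, set()).add(role)
```

For example, after adding 'analyst' to 'staff' and 'staff' to 'admin', an attempt to add 'admin' to 'analyst' is rejected.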

=== 3.1.1 Role management ===

create role

drop role

add a user to a role

remove a user from a role

=== 3.1.2 role metadata ===

role_name - string

create_time - int

=== 3.1.3 hive role user membership table ===

role_name - string

user_name - string

is_group -- is the user name a group name

is_role  -- is the user name a role name
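
To illustrate how the membership table might be used, the Python sketch below computes every role that applies to a user: roles granted directly, roles granted to the user's groups, and roles reached through nesting. The row layout follows the four fields above; the `Membership` tuple and the `effective_roles` function are hypothetical names, and reading a row as "user_name is a member of role_name" is an assumption.

```python
from collections import namedtuple

# One row of the role membership table (field names taken from the list above).
Membership = namedtuple("Membership", "role_name user_name is_group is_role")


def effective_roles(rows, user, groups):
    """Return all roles reachable from `user` and the user's `groups`."""
    groups = set(groups)
    frontier = []
    for row in rows:
        direct = not row.is_group and not row.is_role and row.user_name == user
        via_group = row.is_group and row.user_name in groups
        if direct or via_group:
            frontier.append(row.role_name)

    roles = set()
    while frontier:
        role = frontier.pop()
        if role in roles:
            continue  # already expanded; also guards against bad circular data
        roles.add(role)
        # Follow nesting: rows whose "user" is this role name.
        frontier.extend(r.role_name for r in rows
                        if r.is_role and r.user_name == role)
    return roles
```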


== 3.2 Privileges to be supported by Hive ==

=== 3.2.1 metadata ===

The following shows how we store grant information in the metastore. Deny information is stored
in the same manner, just in different tables.

So for each grant table there is also a deny table. The metastore tables are

user, deny_user, db, deny_db, tables_priv, deny_tables_priv, columns_priv, deny_columns_priv

An alternative is to add a column to each grant table that records whether the row is a grant
or a deny.


We store privileges in a single column, using commas to separate the individual privileges.
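
As a rough illustration of the comma-separated encoding, the Python sketch below converts between the stored column value and a set of privilege names. The function names are hypothetical, and tolerating stray whitespace and empty items is an assumption rather than part of the design.

```python
def parse_privs(column_value):
    """Split a stored privilege column into a set of privilege names."""
    return {p.strip() for p in column_value.split(",") if p.strip()}


def format_privs(privs):
    """Join privilege names back into a single column value (sorted for stability)."""
    return ",".join(sorted(privs))


def grant(column_value, priv):
    """Return the column value with `priv` added."""
    return format_privs(parse_privs(column_value) | {priv})


def revoke(column_value, priv):
    """Return the column value with `priv` removed, if present."""
    return format_privs(parse_privs(column_value) - {priv})
```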


hive> desc user;

Field 

- - - - 

User

isRole

isGroup

isSuper

db_priv -- set (Select_priv, Insert_priv, Create_priv, Drop_priv, Reload_priv,

                Grant_priv, Index_priv, Alter_priv, Show_db_priv,

                Lock_tables_priv, Create_view_priv, Show_view_priv)

hive> desc db;  

Field    

- - - - 

Db

User

isRole

isGroup

Table_priv  -- set (Select_priv, Insert_priv, Create_priv, Drop_priv, Grant_priv, 

                    Index_priv, Reload_priv, Alter_priv, Create_tmp_table_priv, 
 
                    Lock_tables_priv, Create_view_priv, Show_view_priv)


hive> desc tables_priv;


Field

- - - - 

Db

User

isRole

isGroup

Table_name

Grantor

Timestamp

Table_priv  -- set('Select','Insert','Create','Drop','Grant','Index','Alter','Create View','Show view')

Column_priv -- set('Select','Insert')
                        


hive> desc columns_priv;

Field

- - - -     

Db          

User        

isRole

isGroup

Table_name  

Column_name 

Timestamp   

Column_priv -- set('Select','Insert','Update')


= 4. grant/revoke access privilege =

== 4.1 Privilege names/types: ==

ALL Privileges

ALTER

Create

Create temporary tables

Create view

Delete

Drop

Index

Insert

Lock Tables

Select 

Show databases

show view

Super

Update


== 4.2 show grant ==

== 4.3 grant/revoke statement ==

GRANT
    priv_type [(column_list)]
      [, priv_type [(column_list)]] ...
    ON [object_type] priv_level
    TO user [, user] ...
WITH ADMIN OPTION

object_type:
    TABLE

priv_level:
    *
  | *.*
  | db_name.*
  | db_name.tbl_name
  | tbl_name

REVOKE
    priv_type [(column_list)]
      [, priv_type [(column_list)]] ...
    ON [object_type] priv_level
    FROM user [, user] ...

REVOKE ALL PRIVILEGES, GRANT OPTION
    FROM user [, user] ...


DENY  
    priv_type [(column_list)]
      [, priv_type [(column_list)]] ...
    ON [object_type] priv_level
    FROM user [, user] ...
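
The page does not spell out what each priv_level form means. The Python sketch below assumes MySQL-like semantics, which is a guess based on the grammar's resemblance to MySQL: `*.*` is global, `*` means all tables in the current database, and a bare `tbl_name` is resolved against the current database. It maps a priv_level to the metastore table from section 3 that would hold the row. The function name and the `current_db` parameter are hypothetical, and column-level grants (a priv_type with a column_list) are not covered.

```python
def resolve_priv_level(priv_level, current_db):
    """Map a priv_level string to (metastore_table, db, table).

    Assumes MySQL-like meaning for each form; see the note above.
    """
    if priv_level == "*.*":
        return ("user", None, None)            # global privilege
    if priv_level == "*":
        return ("db", current_db, None)        # every table in the current db
    if "." in priv_level:
        db, _, tbl = priv_level.partition(".")
        if tbl == "*":
            return ("db", db, None)            # db_name.*
        return ("tables_priv", db, tbl)        # db_name.tbl_name
    return ("tables_priv", current_db, priv_level)  # tbl_name in the current db
```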

= 5. Authorization verification =

== 5.1 USER/GROUP/ROLE ==

USER

GROUP

ROLE

A GROUP is very similar to a role. We support groups because we may need to pass the group
information to HDFS/MapReduce. A role, however, does not need to be a group.

Role can be nested but not circular.


[
In Oracle, a role groups several privileges and roles so that they can be granted to and
revoked from users simultaneously. A role must be enabled for a user before the user can
use it. Oracle also has role authorization. Creating or dropping a role requires the CREATE
ROLE system privilege.
]

== 5.2 The verification steps ==

When a user logs in to the system, he has a user name and one or more groups that he belongs to.
He may also have been granted some roles.
So his identity is

[

username, 

list of group names, 

list of roles that have been directly granted to the user, 

list of roles that have been directly granted to groups that the user belongs to

].

First, try the user name.

Start by trying to deny the access, looking up the deny tables by user name:


1. If there is an entry in 'user' that denies this access, return DENY

2. If there is an entry in 'db' that denies this access, return DENY

3. If there is an entry in 'table' that denies this access, return DENY

4. If there is an entry in 'column' that denies this access, return DENY



If no deny entry matches, go through all privilege levels with the user name:


5. If there is an entry in 'user' that accepts this access, return ACCEPT

6. If there is an entry in 'db' that accepts this access, return ACCEPT

7. If there is an entry in 'table' that accepts this access, return ACCEPT

8. If there is an entry in 'column' that accepts this access, return ACCEPT



Second, try the user's group and role names one by one until we get an ACCEPT or DENY. A single
DENY from any group or role denies this access.


For each role or group, we follow the same routine as for the user name.
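
The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the Hive implementation: the in-memory `grants` and `denies` dictionaries stand in for the metastore grant and deny tables, all function names are hypothetical, a DENY from any group or role is taken to override an ACCEPT from another one, and `None` is returned when no entry matches because the page does not define a default.

```python
LEVELS = ("user", "db", "table", "column")


def scope_keys(db, table, column):
    """Lookup key at each privilege level for one access."""
    return {"user": (), "db": (db,), "table": (db, table),
            "column": (db, table, column)}


def has_entry(store, principal, priv, keys):
    """True if `store` holds `priv` for `principal` at any level.

    `store` (hypothetical layout): {level: {(principal, *key): set_of_privs}}
    """
    return any(
        priv in store.get(level, {}).get((principal, *keys[level]), set())
        for level in LEVELS
    )


def check_principal(principal, priv, keys, grants, denies):
    """Steps 1-8 for one name: deny tables first, then grant tables."""
    if has_entry(denies, principal, priv, keys):
        return "DENY"
    if has_entry(grants, principal, priv, keys):
        return "ACCEPT"
    return None


def authorize(user, groups_and_roles, priv, db, table, column, grants, denies):
    keys = scope_keys(db, table, column)
    decision = check_principal(user, priv, keys, grants, denies)
    if decision is not None:
        return decision
    accepted = False
    for name in groups_and_roles:
        decision = check_principal(name, priv, keys, grants, denies)
        if decision == "DENY":      # assumption: any DENY wins
            return "DENY"
        accepted = accepted or decision == "ACCEPT"
    return "ACCEPT" if accepted else None
```

With a db-level grant of Select to the group 'users' and a table-level deny of Select on db_name.T for the group 'users2', a user who belongs only to 'users' is accepted, while a user who belongs to both groups is denied.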


== 5.3 Examples ==


5.3.1 I want to grant everyone (new people may join at any time) access to
db_name.*, and later I want to protect one table, db_name.T, from ALL
users except a few


1) Add all users to a group 'users' (assumption: new users will
automatically join this group), and grant 'users' ALL privileges on db_name.*

2) Add those few users to a new group 'users2', and REMOVE them from 'users'

3) DENY 'users' access to db_name.T

4) Grant ALL on db_name.T to 'users2'


5.3.2 I want to protect one table, db_name.T, from one or a few users, while all
other people can access it

1) Add all users to a group 'users' (assumption: new users will automatically
join this group), and grant 'users' ALL privileges on db_name.*.

2) Add those few users to a new group 'users2'. (Note: those few users will now
belong to two groups: 'users' and 'users2'.)

3) DENY 'users2' access to db_name.T


= 6. Where to add authorization in Hive =

CliDriver and HiveServer. Basically they share the same code. If HiveServer invokes CliDriver,
we can just add authorization to CliDriver. We also need to make HiveServer able to support
multiple users/connections.

= 7. Implementation =

== 7.1 Authenticator interface ==

We only get the user's user name and group names from the authenticator. Authenticator implementations
need to provide this information. This is the only interface between authenticator and authorization.
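
A minimal sketch of such an interface, written in Python for brevity. The class and method names are hypothetical and are not taken from the Hive code base. It exposes exactly the two pieces of information named above.

```python
from abc import ABC, abstractmethod
from typing import List


class Authenticator(ABC):
    """The only contract authorization relies on: a user name and group names."""

    @abstractmethod
    def user_name(self) -> str:
        """Return the authenticated user's name."""

    @abstractmethod
    def group_names(self) -> List[str]:
        """Return the names of the groups the user belongs to."""


class StaticAuthenticator(Authenticator):
    """Toy implementation with fixed values, for illustration and testing only."""

    def __init__(self, user: str, groups: List[str]):
        self._user = user
        self._groups = list(groups)

    def user_name(self) -> str:
        return self._user

    def group_names(self) -> List[str]:
        return list(self._groups)
```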

== 7.2 Authorization ==

The authorization decision manager manages a set of authorization providers. Each provider
decides whether to accept or deny an access based on its own information, and the decision
manager makes the final decision. That decision can be vote based, deny on a single -1, or
accept on a single +1.
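
The three decision policies can be sketched as below. This is a hypothetical Python illustration: providers are modeled as plain callables returning +1 (accept) or -1 (deny), the strategy names are invented here, and treating "vote based" as a simple majority of the sum is an assumption.

```python
def decide(providers, user, access, strategy="veto"):
    """Combine provider votes into a final accept (True) or deny (False)."""
    votes = [provider(user, access) for provider in providers]
    if strategy == "veto":          # one -1 then deny
        return -1 not in votes
    if strategy == "any_accept":    # one +1 then accept
        return 1 in votes
    if strategy == "majority":      # vote based: accepts must outnumber denies
        return sum(votes) > 0
    raise ValueError(f"unknown strategy: {strategy!r}")


# Two toy providers, each using its own information.
def metastore_provider(user, access):
    return 1 if user in {"alice", "bob"} else -1


def hdfs_provider(user, access):
    return 1 if user == "alice" else -1
```

With these toy providers, 'bob' is denied under "veto" and "majority" (one +1, one -1) but accepted under "any_accept". Note that "veto" accepts when there are no providers at all, so a real manager would need to define that case explicitly.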

------------

= HDFS Permission =
The above makes a STRONG assumption about file-layer security. A user can easily bypass this
security if the HDFS file permissions are open to him. We hope to be able to plug in external
authorization (such as HDFS permissions) easily, to alter the authorization result or even the
rules.
