hive-dev mailing list archives

From "Carl Steinbach (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-78) Authorization infrastructure for Hive
Date Fri, 22 Oct 2010 01:44:18 GMT

    [ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923733#action_12923733 ]

Carl Steinbach commented on HIVE-78:
------------------------------------

The issue that Todd raised is pretty important and needs to be addressed in the proposal.
My personal opinion is that running all queries as a "hive" super-user is the most
practical approach and will also yield behavior that is familiar to users of traditional
RDBMS systems (who I expect will increasingly define the average Hive user/administrator).

There are some other follow-on issues that need to be decided if we end up settling
on this approach:

* This approach to authorization presupposes that users are accessing Hive through a HiveServer
process. This follows from the fact that A) you want Hive to execute the query plans as the
Hive superuser, and B) a user can circumvent the authorization model if they are given
direct access to the MetaStore DB. It would be nice if the proposal explicitly stated this
requirement and mentioned some of the follow-on work that this necessitates, e.g. fixing concurrency
issues in HiveServer, reducing the memory requirements of HiveServer, etc.

* We need to apply the authorization model to the {{add [archive|file|jar]}} commands as
well as {{add temporary function}}. {{add jar}} and {{add file}} both currently allow the user
to inject code into MR jobs, and {{add jar}} in conjunction with {{add temporary function}}
allows the user to inject and execute arbitrary code within the HiveServer process. We may
also want to add a new {{add executable}} command for adding executable scripts that has a
different permission model than {{add file}}.

* I think there also may be security issues stemming from external tables, e.g. if I create
an external table that points to another user's home directory and then run a query on it
which executes with Hive's superuser permissions.

* Loading data into the Hive warehouse from an arbitrary HDFS location and exporting data
to other locations in HDFS are two issues that need to be considered. In each case I think
the correct behavior depends on both the Hive process's permissions and those of the user.
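
To make the last point concrete, here is a minimal sketch of a check in which a load or export is allowed only if both the Hive process and the requesting user hold the needed permission on the HDFS path. It is purely illustrative: {{PathAcl}}, {{may_access}}, {{may_load}} and {{may_export}} are hypothetical names, not existing Hive or HDFS APIs, and the permission model is deliberately simplified.

```python
from dataclasses import dataclass, field


# Hypothetical, simplified model: a path maps each principal to the set of
# permissions ("read", "write") it holds. Real HDFS permissions are richer.
@dataclass
class PathAcl:
    grants: dict = field(default_factory=dict)

    def allows(self, principal: str, perm: str) -> bool:
        return perm in self.grants.get(principal, set())


def may_access(acl: PathAcl, hive_principal: str, user: str, perm: str) -> bool:
    """Allow the operation only if BOTH the Hive process and the end user
    hold `perm` on the path; the superuser's rights alone are not enough."""
    return acl.allows(hive_principal, perm) and acl.allows(user, perm)


def may_load(src: PathAcl, hive_principal: str, user: str) -> bool:
    # Loading reads from an arbitrary HDFS source location.
    return may_access(src, hive_principal, user, "read")


def may_export(dst: PathAcl, hive_principal: str, user: str) -> bool:
    # Exporting writes to an arbitrary HDFS destination.
    return may_access(dst, hive_principal, user, "write")
```

Under the same assumptions, a check of this shape might also cover the external table case: a query over a table that points at another user's home directory would be rejected because the requesting user lacks read access there, even though the Hive superuser has it.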




> Authorization infrastructure for Hive
> -------------------------------------
>
>                 Key: HIVE-78
>                 URL: https://issues.apache.org/jira/browse/HIVE-78
>             Project: Hive
>          Issue Type: New Feature
>          Components: Metastore, Query Processor, Server Infrastructure
>            Reporter: Ashish Thusoo
>            Assignee: He Yongqiang
>         Attachments: createuser-v1.patch, hive-78-metadata-v1.patch, hive-78-syntax-v1.patch, hive-78.diff
>
>
> Allow Hive to integrate with existing user repositories for authentication and authorization information.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

