hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ryan Harris <Ryan.Har...@zionsbancorp.com>
Subject RE: Best Hive Authorization Model for Shared data
Date Tue, 12 Apr 2016 23:33:06 GMT
if your only problem with #2 is the issue of creating the external table, you should be able
to throw together a script running as a more privileged user that could handle the task of
creating the external table.  Once the table is created, the user should be able to access
the read-only data.

From: Udit Mehta [mailto:umehta@groupon.com]
Sent: Tuesday, April 12, 2016 4:42 PM
To: user@hive.apache.org
Subject: Best Hive Authorization Model for Shared data

Hi all,
I wanted to understand what authorization model is most suitable for a production environment
where most of the data is shared between multiple teams and users.
I know this is would depend more on the use case but I cant seem to figure out the best model
for our use:

We have data that is owned by a certain process (R/W access for that user) while other users
only have Read access to that data. We have a lot of instances when users would want to create
external tables pointing to this data. We tried the following 3 auth models:
1. Default Authorization model: This we think is less secure and any user can grant himself
access to create/modify tables and databases even where they are not supposed to. We would
want to have much tighter security than this model provides.
2. Storage Based Authorization: While this helps us by preventing users from modifying metadata
by checking the HDFS permissions of the underlying directories, it prevents our most important
use case of letting users create external tables on data they dont have write access to. I
would assume external tables wont actually delete the data when dropping tables/partitions
so this operation should be allowed. But because it is not, even this authorization model
does not meet our use case.
3. Sql Standard Based Authorization: This does give us fine-grained control over which users
can perform specific commands, but when it comes to creating external tables, even this authorization
scheme seems to use the filesystem's permissions.
So overall all 3 models didnt seem to fulfill our requirement here which I think would be
a fairly common one. I want to know how other users manage security on Hive or If i am missing
something.
Thanks in advance,
Udit

======================================================================
THIS ELECTRONIC MESSAGE, INCLUDING ANY ACCOMPANYING DOCUMENTS, IS CONFIDENTIAL and may contain
information that is privileged and exempt from disclosure under applicable law. If you are
neither the intended recipient nor responsible for delivering the message to the intended
recipient, please note that any dissemination, distribution, copying or the taking of any
action in reliance upon the message is strictly prohibited. If you have received this communication
in error, please notify the sender immediately.  Thank you.
Mime
View raw message