hive-dev mailing list archives

From Alan Gates
Subject Re: [howldev] RE: Howl Authorization proposal
Date Wed, 13 Oct 2010 16:22:06 GMT

It's not clear to us whether we would still need the HDFS model if a
traditional ACL model were available.  I suspect so, but I'm
not sure.

We had a few concerns with the full ACL model that caused us to avoid  
it at least initially.  In this model Hive/Howl has to own all the  
files and set them to be 700.  Otherwise someone else can go  
underneath and read them via HDFS.  Maybe this is ok, but I wonder if  
it will make it harder to administer.
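
To make the 700 point concrete, here is a minimal sketch of a POSIX-style read check of the kind HDFS applies (owner bits first, then group, then other).  The owner, group, and user names are made up for illustration, and the superuser bypass is ignored.

```python
import stat


def can_read(mode, owner, group, user, user_groups):
    """POSIX-style read check: owner bits, then group bits, then other bits."""
    if user == owner:
        return bool(mode & stat.S_IRUSR)
    if group in user_groups:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)


# Hypothetical warehouse file owned by a service account named "hive".
locked = 0o700  # only the owner may read; everyone else must go through Hive/Howl
shared = 0o750  # group members can also read the file directly via the filesystem

print(can_read(locked, "hive", "hive", "alice", {"analysts"}))
print(can_read(shared, "hive", "analysts", "alice", {"analysts"}))
```

With mode 700 the direct read by alice is refused, so any access she gets has to come from the ACL layer; with 750 and a matching group she can bypass that layer entirely.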

Our biggest concern is that HDFS already has a permissions model, so why
create a whole new one?  It is a lot of duplication.  And that
duplication will flow through to things like logging and auditing, all  
of which Hive/Howl will now need in addition to HDFS.  To justify this  
we needed to understand what additional benefits a traditional ACL  
model would get us.  We were not able to come up with compelling use  
cases where we had to have this traditional model.

One clear issue with using HDFS is extending it to non-HDFS based
tables (such as HBase).  So we should work on this being an interface
that uses the underlying security (be it HDFS or HBase or whatever).
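
As a rough illustration of what such an interface might look like, here is a sketch in Python.  Every name in it (StorageAuthorizer, HdfsAuthorizer, HBaseAuthorizer, authorize_read, the table dictionaries) is hypothetical and not taken from the proposal, and the two backends use in-memory lookups as stand-ins for asking the real storage system.

```python
from abc import ABC, abstractmethod


class StorageAuthorizer(ABC):
    """Hypothetical interface: each storage backend answers for its own tables."""

    @abstractmethod
    def can_read(self, user, table):
        """Return True if `user` may read `table` according to the backend."""


class HdfsAuthorizer(StorageAuthorizer):
    def __init__(self, readers_by_path):
        # Stand-in for checking file permissions on the table's directory.
        self.readers_by_path = readers_by_path

    def can_read(self, user, table):
        return user in self.readers_by_path.get(table["location"], set())


class HBaseAuthorizer(StorageAuthorizer):
    def __init__(self, readers_by_table):
        # Stand-in for whatever access control the HBase side provides.
        self.readers_by_table = readers_by_table

    def can_read(self, user, table):
        return user in self.readers_by_table.get(table["name"], set())


def authorize_read(user, table, authorizers):
    """Delegate the decision to the authorizer for the table's storage type."""
    backend = authorizers.get(table["storage"])
    if backend is None:
        raise LookupError("no authorizer for storage type: " + table["storage"])
    return backend.can_read(user, table)


authorizers = {
    "hdfs": HdfsAuthorizer({"/warehouse/clicks": {"alice"}}),
    "hbase": HBaseAuthorizer({"profiles": {"bob"}}),
}
clicks = {"name": "clicks", "storage": "hdfs", "location": "/warehouse/clicks"}
profiles = {"name": "profiles", "storage": "hbase"}

print(authorize_read("alice", clicks, authorizers))
print(authorize_read("alice", profiles, authorizers))
```

The point of the sketch is only that the metadata layer keeps no permissions of its own; it looks up the storage type and asks that system.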

All that said, I see no problem with having two models for now, and  
seeing which turns out to better provide what users need and/or be  
easier to maintain.


On Oct 11, 2010, at 5:12 PM, John Sichi wrote:

> Hi Pradeep,
> Namit and I took a look at the doc; thanks for the clear writeup.
> Coincidentally, we've been starting to think about some Hive  
> authorization use cases within Facebook as well.  However, the  
> approach we're thinking about is more along the lines of traditional  
> SQL ACLs (role-based GRANT/REVOKE with persistence in the  
> metastore) rather than HDFS-based.  HIVE-78 touches on this (plus a  
> lot of unrelated stuff).
> So, one question is whether you would still need the HDFS-based
> approach if a metastore-level ACL solution were available?
> And if the answer to that is no, then would you prefer to skip the  
> HDFS-based work and just join forces on the ACL solution?
> If it turns out that you're going to need the HDFS-based approach,  
> then I can see how both can coexist (either as alternatives, or as  
> one overlaid on top of the other).  The HDFS-based approach can be  
> useful for controlling how HDFS permissions are managed in the case  
> where users are allowed direct access to HDFS, or when multiple  
> clients are used for access (which is one of the main reasons for  
> Howl to exist).
> Regarding development of the HDFS-based approach, it would make  
> sense to start off with enforcement via hooks.  I think now that we  
> have the semantic analyzer hooks, it should be possible to do it  
> either all there or via a combination of that and execution hooks.
> The code for the hook implementations can start out in Howl, and  
> then if there's consensus on adopting it within Hive, we can move it  
> at that time.
> On Oct 5, 2010, at 1:19 PM, Pradeep Kamath wrote:
>> Also, if this proposal looks reasonable, it would be nice if Hive
>> would also adopt it, so comments from Hive developers/committers
>> on the feasibility would be much appreciated!
>> Thanks,
>> Pradeep
>> From: Pradeep Kamath
>> Sent: Tuesday, October 05, 2010 1:14 PM
>> To: ''
>> Subject: Howl Authorization proposal
>> Hi,
>>    I have posted a proposal for implementing authorization in Howl
>> based on HDFS file permissions at

>> . Please provide any comments/feedback on the proposal.
>> Thanks,
>> Pradeep
