From: Alan Gates
Date: Tue, 05 May 2015 09:02:12 -0700
To: dev@hive.apache.org
Subject: Re: Why SQL Standards Based Authorization is implemented on HiveServer2 side only?
Message-ID: <5548E984.1070403@gmail.com>

The issue is that security checks are done in the client. In order to fully do them in the metastore, we would have been forced to move significant amounts of functionality out of the client and into the metastore. Query parsing and planning would have had to be moved to the metastore, basically making it HS2. A few more comments inline:

Sergey Tryuber wrote on May 4, 2015 at 12:56:
> Hi Guys,
>
> My understanding is that there are two safe ways to use SQL Standards
> Based Authorization (SSBA):
>
> 1. Hide the Hive Metastore from the world by embedding it into HiveServer2.
> The *MetaStoreAuthzAPIAuthorizerEmbedOnly* configuration for the Metastore
> is only half-protection, since everyone can change table-specific metadata.
> 2. Have "two Metastores", but the public one should be additionally
> protected by Storage Based Authorization.
>
> Option #2 is in much greater demand, since there are so many frameworks in
> the Hadoop ecosystem that use the Hive Metastore. But the need to keep both
> SQL and HDFS ACLs in sync is an administration nightmare (especially taking
> into account that the "doAs" option is false in SSBA mode).
>
> *Why isn't it possible to add an SSBA-like authorizer to the Hive Metastore
> as well?* The authorizer could check whether a user has permission to update
> table-specific metadata according to his role and username. I could even
> imagine the following layout:
>
> 1. All the files in Hive tables can be accessed only by a few system users
> (hive, spark-sql, impala, etc.)

How would you accomplish this? Hive's files are stored in HDFS and thus must work with HDFS file permissions. You could construct a group that contained those users and make the files accessible to that group, but each cluster admin would have to do that.

> 2. There is only a single place for granting permissions - through SQL
> standards - and all SQL-like frameworks around the metastore should use it

We can't break backwards compatibility, so we could make this an option but we couldn't enforce it.

> 3. Additional HDFS permissions configuration would be needed only for
> rare cases of data access from non-impersonated execution pipelines (Spark
> Core, etc.)
> 4. No need to embed the metastore into HiveServer2, no strange
> configuration options, easier to understand and document
>
> Maybe I've missed something in my understanding... If so, please point out
> my mistake.
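
For illustration only, the group-based approach mentioned in the reply to item 1 can be sketched as a simplified owner/group/other read check in the style of HDFS permissions. This is a rough model, not Hadoop or Hive code: the `WarehouseFile` class, the `can_read` helper, the `sqlengines` group and the user-to-group mapping are all hypothetical names, and ACLs and the HDFS superuser are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WarehouseFile:
    """Hypothetical stand-in for a file's owner, group and permission bits."""
    owner: str
    group: str
    mode: int  # POSIX-style permission bits, e.g. 0o770


def can_read(user: str, user_groups: set, f: WarehouseFile) -> bool:
    """Simplified read check: exactly one class (owner, group or other) applies."""
    if user == f.owner:
        return bool(f.mode & 0o400)
    if f.group in user_groups:
        return bool(f.mode & 0o040)
    return bool(f.mode & 0o004)


# Assumed group membership, which each cluster admin would have to set up.
GROUPS = {
    "hive": {"hive", "sqlengines"},
    "impala": {"impala", "sqlengines"},
    "alice": {"users"},
}

# A table file owned by hive, readable by the shared group, closed to others.
table_file = WarehouseFile(owner="hive", group="sqlengines", mode=0o770)

for user in ("hive", "impala", "alice"):
    print(user, can_read(user, GROUPS[user], table_file))
```

In this model the system users reach the file through the shared group while an ordinary user such as `alice` does not, which is the layout item 1 asks for. It also shows why the setup is per-cluster work: the group and its membership live outside Hive.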
