From: "Todd Lipcon (Commented) (JIRA)"
To: issues@hbase.apache.org
Date: Thu, 1 Mar 2012 08:38:05 +0000 (UTC)
Subject: [jira] [Commented] (HBASE-5498) Secure Bulk Load
Message-ID: <1749703237.6625.1330591085033.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <47674994.5503.1330561318651.JavaMail.tomcat@hel.zones.apache.org>

[ 
https://issues.apache.org/jira/browse/HBASE-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219915#comment-13219915 ]

Todd Lipcon commented on HBASE-5498:
------------------------------------

Two things:

1) HBase currently has no dependencies on MR. One can submit MR jobs that read from/write to HBase, but really those are just using the MR client from within an MR context.

2) A large proportion of bulk load use cases generate HFiles at the end of a pipeline which involves custom user code -- for example the map phase parses and processes a custom format, emitting keys into whatever structure is needed in HBase. Then, the reducer performs the partition/sort to write out the appropriate HFiles. Thus the jobs themselves are running user code, not just a pre-baked example like ImportTSV.

The proposed solution doesn't address this use case -- it's totally unacceptable to run user code under an HBase security context in a multi-tenant environment.

> Secure Bulk Load
> ----------------
>
>                 Key: HBASE-5498
>                 URL: https://issues.apache.org/jira/browse/HBASE-5498
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Francis Liu
>
> Design doc: https://cwiki.apache.org/confluence/display/HCATALOG/HBase+Secure+Bulk+Load
> Short summary:
> Security as it stands does not cover the bulkLoadHFiles() feature. Users calling this method will bypass ACLs. Also, loading is made more cumbersome in a secure setting because of HDFS privileges. bulkLoadHFiles() moves the data from the user's directory to the hbase directory, which would require certain write access privileges to be set.
> Our solution is to create a coprocessor which makes use of AuthManager to verify if a user has write access to the table. If so, it launches a MR job as the hbase user to do the importing (i.e. rewrite from text to HFiles). One tricky part this job will have to do is impersonate the calling user when reading the input files.
> We can do this by expecting the user to pass an HDFS delegation token as part of the secureBulkLoad() coprocessor call and by extending an InputFormat to make use of that token. The output is written to a temporary directory accessible only by hbase and then bulkLoadHFiles() is called.
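
For reference, the flow in the quoted description can be sketched roughly as below. This is only a minimal sketch under stated assumptions: every type and name here (WriteAccessCheck, ImportJob, HFileLoader, the staging path) is hypothetical and stands in for AuthManager, the MR import job and bulkLoadHFiles() respectively; none of it is an actual HBase or Hadoop API. It covers only the pre-baked import case from the description, not jobs that run custom user code.

```java
/** Hypothetical sketch of the proposed secureBulkLoad() flow; not HBase code. */
final class SecureBulkLoadSketch {

    /** Stand-in for the write-access check the description delegates to AuthManager. */
    interface WriteAccessCheck {
        boolean canWrite(String user, String table);
    }

    /** Stand-in for the import job that would run as the hbase user. */
    interface ImportJob {
        // Reads inputPath on behalf of the caller (via the delegation token)
        // and writes HFiles into outputDir.
        void run(String inputPath, String delegationToken, String outputDir);
    }

    /** Stand-in for the existing bulkLoadHFiles() step. */
    interface HFileLoader {
        void load(String table, String hfileDir);
    }

    private final WriteAccessCheck acl;
    private final ImportJob job;
    private final HFileLoader loader;

    SecureBulkLoadSketch(WriteAccessCheck acl, ImportJob job, HFileLoader loader) {
        this.acl = acl;
        this.job = job;
        this.loader = loader;
    }

    /** Returns the staging directory that was used for the generated HFiles. */
    String secureBulkLoad(String user, String table, String inputPath, String delegationToken) {
        // Step 1: refuse callers without write access to the target table.
        if (!acl.canWrite(user, table)) {
            throw new SecurityException(user + " has no write access to " + table);
        }
        // Step 2: hypothetical staging path; restricting it to the hbase user
        // is assumed to happen elsewhere and is not modelled here.
        String stagingDir = "/tmp/hbase-staging/" + table;
        // Step 3: run the import, reading the input with the caller's token.
        job.run(inputPath, delegationToken, stagingDir);
        // Step 4: hand the generated HFiles to the regular bulk load step.
        loader.load(table, stagingDir);
        return stagingDir;
    }
}
```

Note that the ImportJob step in this sketch is the part that would execute under the hbase identity, which is the step the comment above objects to when the job contains user code.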