Subject: Re: HBase security research
From: Andrew Purtell
Date: Sun, 17 Jun 2012 10:28:49 -0700
To: "dev@hbase.apache.org"

I'd also encourage you to read HBASE-1637 and subtasks to see what has already gone in and how it was implemented, basically as Joey had suggested. If you reimplement something, the first question that will be asked is what part of the HBase code can be reused, i.e. incremental dev is preferred where possible.

Your work sounds interesting and also challenging, as it seems you may have to substantially hack the DFSClient as well as HBase.

- Andy

On Jun 17, 2012, at 8:40 AM, Jonathan Hsieh wrote:

> Hi erwinx,
>
> Sounds interesting to me!
>
> If your purpose is research or a paper, I'm always a fan of spending
> some time to define the problem you are trying to solve (something
> constrained to 2 pages would be good). I find it personally helpful, and
> it would help us greatly if you ask us for implementation advice! After
> that I'd follow Joey's advice as an implementation avenue: start
> hacking using the coprocessor interface.
>
> Does your goal also include potential integration as part of HBase?
>
> The threat model sketch you are assuming sounds interesting.
> Up to this point, our threat model roughly gives the attacker only the
> ability to make arbitrary RPCs and the ability to sniff client traffic;
> the attacker is also assumed not to have credentials to reach the
> underlying HDFS file system.
>
> There are a few issues on the bug/feature tracker that may be related
> to what you are looking into. Here are some links to get started; it
> would be nice to frame what you are trying to solve in relation to
> those. :)
>
> https://issues.apache.org/jira/browse/HBASE-6222 Key value visibility tags.
> https://issues.apache.org/jira/browse/HBASE-1697 DAC umbrella
>
> Jon.
>
> On Sun, Jun 17, 2012 at 5:29 AM, erwin x wrote:
>
>> Hi all,
>>
>> I am investigating how HBase can be used to store sensitive/confidential
>> information.
>> This research is part of my master's thesis in computing science at a
>> university.
>>
>> The research mostly concerns confidentiality, for example:
>> - Describing the location of the data within the distributed system
>> - Role-based access control
>> - Fine-grained access control (at column/row level)
>> - Built-in encryption based on the role
>> - The impact on performance and validation of the above security.
>>
>> My questions are:
>>
>> 1) Are the above features interesting for HBase?
>> 2) Should I propose my changes and results in the Jira of HBase?
>>
>> This research assumes that the data is so sensitive that even
>> administrators, developers or other malicious accessors may not see
>> the data unless they have an authorized role.
>>
>> If I observed correctly (correct me if I am wrong), security in HBase
>> now focuses primarily on authentication and discretionary access
>> control and assumes that no malicious user has access to the
>> underlying system, for example HDFS, hard drive or shell access, because
>> data can still be read in that way.
>> My research focuses on extending
>> HBase security with more authorization and confidentiality features.
>>
>> Thanks in advance!
>>
>> Kind regards,
>> erwinx
>>
>
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // jon@cloudera.com