Date: Thu, 5 Nov 2015 11:43:17 -0500
Subject: Re: [DISCUSS] What to do about encryption at rest?
From: William Slacum
To: dev
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=001a114d85cef0cc310523cdd22c

--001a114d85cef0cc310523cdd22c
Content-Type: text/plain; charset=UTF-8

Yup, #2. I also don't know if it's worth the effort for that specific
feature. It might be easier to add something like per-namespace and/or
per-table encryption, then define common access patterns for applications
that want to use multiple keys for encryption.

On Wed, Nov 4, 2015 at 8:10 PM, Adam Fuchs wrote:

> Bill,
>
> Do you envision one of the following as the driver behind finer-grained
> encryption?:
>
> 1. We would only encrypt certain columns in order to get better
> performance;
>
> 2. We would use different keys on different columns in order to revoke
> access to a column via the key store;
>
> 3. We would only give a tablet server access to a subset of columns at any
> given time in order to protect something, and figure out what to do for
> compactions, etc.;
>
> 4. Something entirely different...
>
> Seems like thing #2 might have merit, but I'm not sure it's worth the
> effort.
>
> Adam
> On Nov 4, 2015 7:38 PM, "William Slacum" wrote:
>
> > @Adam, column family level encryption can be useful for multi-tenant
> > environments, and I think it maps pretty well to the document
> > partitioning/sharding/wikisearch style tables. Things are trickier in
> > Accumulo than in HBase since there isn't a 1:1 mapping between column
> > families and files. The built in RFile encryption scheme seems better
> > suited to this.
> >
> > @Christopher & Keith, it's something we can evaluate. Is there a good
> > test harness for just writing an RFile, opening a reader to it, and
> > just poking around? I was looking at the constructors and they didn't
> > seem straightforward enough for me to comprehend them within a few
> > seconds.
> >
> >
> >
> > On Tue, Nov 3, 2015 at 9:56 PM, Keith Turner wrote:
> >
> > > On Mon, Nov 2, 2015 at 1:37 PM, Keith Turner wrote:
> > >
> > > >
> > > >
> > > > On Mon, Nov 2, 2015 at 12:27 PM, William Slacum wrote:
> > > >
> > > >> Is "the code being 'at rest'" you making a funny about active
> > > >> development? Making sure I haven't lost my ability to get jokes :)
> > > >>
> > > >> I see two reasons why the code would be inactive: the feature is
> > > >> good enough as is or it's not interesting enough to attract
> > > >> attention. Considering it's not public API, there are no
> > > >> discussions to bring into the public API, and there's no effort
> > > >> to document how to use it, my intuition tells me that there isn't
> > > >> enough interest in it from a project perspective.
> > > >>
> > > >> From a user perspective, I've been getting asked about it when I
> > > >> work with Accumulo users. My recommendation, exclusively, is to
> > > >> use HDFS encryption because I can go to Hadoop's website and find
> > > >> documentation on it. When I go to find documentation on Accumulo's
> > > >> offerings, any usability information comes from vendor
> > > >> SlideShares. Most mentions of the feature on official Apache
> > > >> Accumulo channels echo Christopher's sentiments on the feature
> > > >> being experimental and not being officially recommended for use.
> > > >>
> > > >> I wouldn't want to rip out the feature first and then figure
> > > >> things out later. Sean already alluded to it, but a roadmap should
> > > >> contain something (tool or documentation) to help users migrate if
> > > >> we go down that route.
> > > >>
> > > >> What I'm trying to figure out is, when the question of "How do I
> > > >> do encryption at rest in Accumulo?" comes up, what is our
> > > >> community's answer?
> > > >>
> > > >> If we went down the route of using HDFS encryption zones, can we
> > > >> offer the same features? At the very least, we'd be offering the
> > > >> same database-level
> > > >>
> > > >
> > > > Where does the decryption happen with DFS, is it in the DFS client?
> > > > If so, using HDFS level encryption seems to offer the same
> > > > functionality???
> > > >
> > > > Has anyone written a tool that takes an
> > > > Accumulo-encrypted-HDFS-unencrypted-RFile and rewrites it is as an
> > > > Accumulo-unencrypted-HDFS-encrypted-RFile? Wondering if there are
> > > > any unexpected gotchas w/ this.
> > > >
> > >
> > > I was discussing my questions w/ Christopher today and he mentioned an
> > > experiment that I thought was interesting. What is the random seek
> > > performance of Accumulo-encrypted-HDFS-unencrypted-RFile vs
> > > Accumulo-unencrypted-HDFS-encrypted-RFile?
> > >
> > > >
> > > >> encryption scheme. I don't know the details of "more advanced key
> > > >> stores", but it seems like we could potentially take any custom
> > > >> implementation and map it to a KeyProvider [1].
> > > >> I could also envision table level encryption being implementable
> > > >> via zones, but probably not down to the column family level.
> > > >>
> > > >> [1]
> > > >> https://hadoop.apache.org/docs/r2.6.0/api/org/apache/hadoop/crypto/key/KeyProvider.html
> > > >>
> > > >>
> > > >> On Sun, Nov 1, 2015 at 10:19 AM, Adam Fuchs wrote:
> > > >>
> > > >> > Responses inline.
> > > >> >
> > > >> > Adam
> > > >> >
> > > >> > On Nov 1, 2015 9:58 AM, "Christopher" wrote:
> > > >> > >
> > > >> > > 1. I'm not sure I'd call an incomplete solution 'great'. What
> > > >> > > it does is provide partial encryption-at-rest protection
> > > >> > > (unless you're running without walogs, and have good
> > > >> > > integration with some external secure key management faculty,
> > > >> > > and then it's probably fine).
> > > >> >
> > > >> > The only thing that doesn't get encrypted is a temporary WAL
> > > >> > recovery file. That is a project we should take on, but it does
> > > >> > not imply that the existing features are not valuable. With HDFS
> > > >> > encryption options this would now be a much easier project to
> > > >> > take on. Also, the users I know that use encryption at rest do
> > > >> > so with a more secure key store than the default.
> > > >> >
> > > >> > >
> > > >> > > 2. I'm concerned that anybody using Accumulo's E-A-R don't
> > > >> > > necessarily realize its current shortcomings, or its lack of
> > > >> > > upstream maintenance support (which it has not been
> > > >> > > receiving). It may be the case that these users have support
> > > >> > > from an intermediary, and do understand the shortcomings... I
> > > >> > > don't know, but it's a concern.
> > > >> >
> > > >> > Anybody that creates a secure system has to analyze the security
> > > >> > of the system as a whole. Accumulo's encryption at rest is one
> > > >> > part of the solution.
> > > >> > Taking away the tool without providing an alternative does
> > > >> > nothing to improve the security of systems built on Accumulo.
> > > >> >
> > > >> > >
> > > >> > > 3. Correction: it has been an explicitly experimental feature
> > > >> > > and an incomplete one, which hasn't really been touched in two
> > > >> > > years, and has been explicitly excluded by the community for
> > > >> > > being public API because of its incompleteness. Age doesn't
> > > >> > > determine public API status. The community does.
> > > >> >
> > > >> > People are using it, so we have to consider the implications of
> > > >> > whatever changes we make and weigh against the benefits. I
> > > >> > believe the last bug fix was done this year, so I would argue it
> > > >> > is being maintained. Changes to our encryption at rest
> > > >> > implementation will have consequences for those users. There had
> > > >> > better be a clear benefit if we break their systems.
> > > >> >
> > > >> > >
> > > >> > > 4. Has Accumulo's been evaluated for security and performance?
> > > >> > > By whom? Is it published?
> > > >> >
> > > >> > Yes, there have been several talks at meetups and conferences
> > > >> > that discuss the security and performance of the current
> > > >> > solution.
> > > >> >
> > > >> > >
> > > >> > > On Sun, Nov 1, 2015, 08:55 Adam Fuchs wrote:
> > > >> > >
> > > >> > > > There's another way to look at the state of Accumulo's
> > > >> > > > encryption at rest:
> > > >> > > > 1. Encryption at rest works great for what it does, and the
> > > >> > > > code being "at rest" isn't necessarily a problem
> > > >> > > > 2. Several organizations are using Accumulo's encryption at
> > > >> > > > rest effectively in operations
> > > >> > > > 3. Encryption at rest has been a supported configuration
> > > >> > > > option for over two years with established plugin
> > > >> > > > interfaces, and therefore it should be considered part of
> > > >> > > > the public API
> > > >> > > > 4. Upstream alternatives (to my knowledge) have not been
> > > >> > > > analyzed for performance or security
> > > >> > > >
> > > >> > > > The given option #2 would at least require an analysis of
> > > >> > > > alternatives, and we would have to decide what to do about
> > > >> > > > backwards compatibility for users using custom key stores
> > > >> > > > and encryption strategies that may or may not be supported
> > > >> > > > by upstream alternatives.
> > > >> > > >
> > > >> > > > As far as option #1 goes, I can get behind encouraging
> > > >> > > > people to take up projects to improve Accumulo's encryption.
> > > >> > > > I think we're already going down this path, but without
> > > >> > > > having identified resources to do the improvements. Any
> > > >> > > > volunteers?
> > > >> > > >
> > > >> > > > Adam
> > > >> > > >
> > > >> > > >
> > > >> > > > On Fri, Oct 30, 2015 at 4:22 PM, William Slacum
> > > >> > > > <wslacum@gmail.com> wrote:
> > > >> > > >
> > > >> > > > > So I've been looking into options for providing encryption
> > > >> > > > > at rest, and it seems like what Accumulo has is
> > > >> > > > > abandonware from a project perspective. There is no
> > > >> > > > > official documentation on how to perform encryption at
> > > >> > > > > rest, and the best information from its status comes from
> > > >> > > > > year (or greater) old ticket comments about how the
> > > >> > > > > feature is still experimental. Recently there was a talk
> > > >> > > > > that described using HDFS encryption zones as an
> > > >> > > > > alternative.
> > > >> > > > >
> > > >> > > > > From my perspective, this is what I see as the current
> > > >> > > > > situation:
> > > >> > > > >
> > > >> > > > > 1- Encryption at rest in Accumulo isn't actively being
> > > >> > > > > worked on
> > > >> > > > > 2- Encryption at rest in Accumulo isn't part of the public
> > > >> > > > > API or marketed capabilities
> > > >> > > > > 3- Documentation for what does exist is scattered
> > > >> > > > > throughout Jira comments or presentations
> > > >> > > > > 4- A viable alternative exists that appears to have
> > > >> > > > > feature parity in HDFS encryption
> > > >> > > > > 5- HBase has finer grained encryption capabilities that
> > > >> > > > > extend beyond what HDFS provides
> > > >> > > > >
> > > >> > > > > Moving forward, what's the consensus for supporting this
> > > >> > > > > feature? Personally, I see two options:
> > > >> > > > >
> > > >> > > > > 1- Start going down a path to bring the feature into the
> > > >> > > > > forefront and start providing feature parity with HBase
> > > >> > > > >
> > > >> > > > > or
> > > >> > > > >
> > > >> > > > > 2- Remove the feature and place emphasis on upstream
> > > >> > > > > encryption offerings
> > > >> > > > >
> > > >> > > > > Any input is welcomed & appreciated!
> > > >> > > > >

--001a114d85cef0cc310523cdd22c--