Mailing-List: contact dev-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@hbase.apache.org
Received-SPF: pass (nike.apache.org: domain of todd@cloudera.com designates
 209.85.215.50 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CA+RK=_DN9MrTRi91dvYjF1gCz-V-fQQ0CbJmAX_fcJhVB2Pq7w@mail.gmail.com>
References: <30C1634D-0D0C-47E2-9D64-E8C313CCFF07@yahoo-inc.com>
 <CA+RK=_Cp3FUxhTpTgTTfcgRtq4hAwTmo-ijOFUPx8Y4eyzzOBg@mail.gmail.com>
 <CADY20s7C4TmjnVqaPtDq8p9B78rwiU8Pe9gxkLE4FtS7F6J+Ew@mail.gmail.com>
 <CA+RK=_DN9MrTRi91dvYjF1gCz-V-fQQ0CbJmAX_fcJhVB2Pq7w@mail.gmail.com>
From: Todd Lipcon <todd@cloudera.com>
Date: Thu, 24 Apr 2014 10:21:38 -0700
Message-ID: 
 <CADY20s4KCWcUpw7ODcV9wt4g-2aOGjJpyV4cxf0sFe+hakdE5w@mail.gmail.com>
Subject: Re: Behavior change in Access Controller between 0.94 and 0.98
To: dev <dev@hbase.apache.org>
Content-Type: multipart/alternative; boundary=089e0158caf22842d704f7cd1608

--089e0158caf22842d704f7cd1608
Content-Type: text/plain; charset=ISO-8859-1

On Thu, Apr 24, 2014 at 10:13 AM, Andrew Purtell <apurtell@apache.org> wrote:

> > Does this leave us open to leaking row existence due to timing
> > differences?
>
> I think I have to answer yes because we've never considered a defense
> against this kind of attack against HBase data sources ever. As you say it
> would depend on schema design. Do you think defending against timing
> attacks is something HBase should do?


In certain cases... (see below)


> Is this a feature offered by MySQL or
> Postgres or commercial RDBMSes?


AFAIK most commercial databases don't offer "hidden visibility" access
control as something distinct from "access denied". That is to say, you may
deny access to a table, but in that case any access to the table returns an
error rather than an empty result.

In our case we're probably leaking table size as well -- a scan with no key
range attached should take time proportional to the amount of data in the
table, even if you have no access. In commercial DBs I would be surprised
if a user could get these types of estimates for a table they're denied
access to.


> Or perhaps your point is more that the
> original behavior of the AccessController is better because the number of
> users able to perform this kind of attack would be limited to explicit
> grants at the table or CF level.
>

Right. If I have a multitenant system and I deny you access, I wouldn't
expect you to be able to perform these kinds of attacks.

I'm a bit of an outsider (haven't followed the implementation of the
security features or why the design choices were made), but from the
"outsider" perspective, I would have guessed that the table-level ACLs
would have two different permissions: READ (default VISIBLE) and READ
(default INVISIBLE). If a user has the former, then they can see all cells
that aren't explicitly made invisible to them, and if the user has the
latter, they can't see any cells unless made explicitly visible. But if
they have neither type of READ permission on the table level, then they
shouldn't be able to access the table at all.
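
As a rough illustration of those three cases, here is a minimal Java sketch.
Every name in it (TableRead, CellAcl, canSeeCell, AccessDenied) is
hypothetical and made up for this mail; none of it is the actual
AccessController or TableAuthManager API, and it ignores CF-level grants
entirely.

```java
// Hypothetical model of the proposed table-level READ modes; not HBase code.
final class TableReadSketch {
    /** Table-level grant held by the user (assumed names). */
    enum TableRead { NONE, READ_DEFAULT_VISIBLE, READ_DEFAULT_INVISIBLE }

    /** Cell-level ACL entry for the user; UNSET means the cell says nothing. */
    enum CellAcl { UNSET, ALLOW, DENY }

    /** Raised up front, so callers get an error rather than an empty result. */
    static final class AccessDenied extends RuntimeException {
        AccessDenied(String message) { super(message); }
    }

    /** Decides whether one cell is visible, given the table grant and cell ACL. */
    static boolean canSeeCell(TableRead grant, CellAcl cellAcl) {
        switch (grant) {
            case READ_DEFAULT_VISIBLE:
                // Visible unless the cell explicitly hides itself from the user.
                return cellAcl != CellAcl.DENY;
            case READ_DEFAULT_INVISIBLE:
                // Hidden unless the cell explicitly exposes itself to the user.
                return cellAcl == CellAcl.ALLOW;
            default:
                // Neither READ flavor: refuse the table outright.
                throw new AccessDenied("no READ permission on table");
        }
    }

    public static void main(String[] args) {
        System.out.println(canSeeCell(TableRead.READ_DEFAULT_VISIBLE, CellAcl.UNSET));
        System.out.println(canSeeCell(TableRead.READ_DEFAULT_INVISIBLE, CellAcl.UNSET));
        try {
            canSeeCell(TableRead.NONE, CellAcl.ALLOW);
        } catch (AccessDenied e) {
            System.out.println("denied: " + e.getMessage());
        }
    }
}
```

The point of the NONE branch is that the check fails before any data is
touched, so a cell-level ALLOW cannot turn a table the user has no grant on
into something they can probe.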

This will also come into play once we do a better job of safeguarding META.
AFAIK today a user can scan META and see the row keys for region boundaries
regardless of their access to those tables. This seems like the kind of
thing that you'd need to allow for a user who has READ (even if they have
default invisible), but you wouldn't want to allow for an arbitrary user on
the cluster.

-Todd


>
>
>
> On Thu, Apr 24, 2014 at 10:04 AM, Todd Lipcon <todd@cloudera.com> wrote:
>
> > Does this leave us open to leaking row existence due to timing
> > differences?
> >
> > For example, imagine you had a table where I happened to know (eg from
> > reading your design docs on the wiki) that the key is made up of social
> > security numbers. If I wanted to come up with a list of valid SSNs, I
> > could issue GETs against your table. If I issue a GET for an invalid SSN,
> > the response will come back on average quite a bit faster than if I issued
> > a GET for a valid SSN (since the invalid SSN would be filtered by blooms
> > where the valid one would not).
> >
> > -Todd
> >
> >
> > On Thu, Apr 24, 2014 at 9:49 AM, Andrew Purtell <apurtell@apache.org>
> > wrote:
> >
> > > This is an intended change that was done as part of introducing cell
> > > ACLs. Otherwise we can't support use cases where the user has no
> > > authorization on the table or CF level but cell ACLs grant exceptional
> > > access. It also brings the AccessController behavior in line with the new
> > > VisibilityController - cells which the user is not authorized to see are
> > > invisible in both settings.
> > >
> > > Enis recently brought up the same issue, let me copy that here:
> > >
> > > >>>
> > >
> > > Subject: Get / Scan without table ACLs no longer throws
> > > AccessDeniedException
> > >
> > > I was a bit surprised to find out about the case where there is a
> > > behavioral change in trying to read from tables on which the user does
> > > not have table/cf level permission.
> > > [...]
> > > Also this behavioral change is applicable to the audit log as well, we
> > > no longer mark the access granted / denied requests for gets and scans in
> > > the audit log which is concerning.
> > >
> > > From the last paragraph in
> > > https://blogs.apache.org/hbase/entry/hbase_cell_security, Andrew states
> > > that there are two modes now, check cell first or not
> > > (Query.setACLStrategy()).
> > >
> > > However, my understanding was that the default behavior should check
> > > table first, and then not do the scan at all if that is denied. From the
> > > code in TableAuthManager.authorize(), it does not look to be the case. My
> > > questions are:
> > >  1) This is a behavioral change, and changes the default behavior as well
> > > regardless of whether cell level security is used or not. Should we
> > > revert back to the original behavior?
> > >  2) Even if we do not revert, should we record get / scans in the audit
> > > log?
> > >  3) Are we targeting two use cases (a) user does not have table level
> > > auth, but selectively has cell level access, and (b) user does have table
> > > level auth but selectively does NOT have cell level access? For these two
> > > use cases, should the strategy be a table level property rather than a
> > > per-op property?
> > >
> > > <<<
> > >
> > > To which I replied:
> > >
> > > >>>
> > >
> > > The answer is #3.
> > >
> > > It could be made a table level property.
> > >
> > > > Also this behavioral change is applicable to the audit log as well, we
> > > > no longer mark the access granted / denied requests for gets and scans
> > > > in the audit log which is concerning.
> > >
> > > This is some kind of logic bug or oversight, please file a jira.
> > >
> > > <<<
> > >
> > > So if the consensus is this is too surprising or unwanted, then we can
> > > without much difficulty make the new behavior configurable on a per
> > > table basis and have the default be the new behavior, with a release note
> > > and paragraph in the security guide explaining how to reintroduce the old
> > > behavior. I think that covers the bases.
> > >
> > >
> > >
> > > On Thu, Apr 24, 2014 at 12:35 AM, Vandana Ayyalasomayajula <
> > > avandana@yahoo-inc.com> wrote:
> > >
> > > > Hi All,
> > > >
> > > > We have seen a behavior change in the manner AccessController blocks
> > > > unauthorized users between 0.94 and 0.98.
> > > > In 0.98, if an unauthorized user tries to perform a GET or SCAN, empty
> > > > results are returned, whereas the same operations
> > > > in 0.94 used to throw access denied exceptions.
> > > >
> > > > Is this a behavior change or a bug in 0.98? It would be of great help
> > > > if someone could point me to any jira which has discussions related to
> > > > these changes.
> > > >
> > > > Thanks
> > > > Vandana
> > >
> > >
> > >
> > >
> > > --
> > > Best regards,
> > >
> > >    - Andy
> > >
> > > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > > (via Tom White)
> > >
> >
> >
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
> >
>
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>


-- 
Todd Lipcon
Software Engineer, Cloudera

--089e0158caf22842d704f7cd1608--