hbase-dev mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HBASE-2893) Table metacolumns
Date Sat, 11 Apr 2015 01:32:16 GMT

     [ https://issues.apache.org/jira/browse/HBASE-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell resolved HBASE-2893.
    Resolution: Later

> Table metacolumns
> -----------------
>                 Key: HBASE-2893
>                 URL: https://issues.apache.org/jira/browse/HBASE-2893
>             Project: HBase
>          Issue Type: New Feature
>          Components: Coprocessors
>            Reporter: Andrew Purtell
> Some features like TTLs or access control lists have use cases that call for per-value configuration.
> Currently in HBase TTLs are set per column family. This leads to potentially awkward
"bucketing" of values into column families set up to accommodate the common desired TTLs for
all values within -- an unnecessarily wide schema, with resulting unnecessary reduction in
I/O locality in access patterns, more store files than otherwise, and so on.
> Over in HBASE-1697 we're considering setting ACLs on column families. However, we are
aware of other BigTable-like systems which support per-value ACLs. This allows for multitenancy
in a single table, as opposed to requiring a table (or at least a column family) for each
customer. The scale-out properties of a single table are better than the alternatives. I think
supporting per-row ACLs would be generally sufficient: customer ID could be part of the row
key. We can still plan to maintain column-family level ACLs. We would therefore not have to
bloat the store with per-row ACLs for the normal case -- but it would be highly useful to
support overrides for particular rows. So how to do that?
> I propose to introduce _metacolumns_. 
> A _metacolumn_ would be a column family intrinsic to every table, created by the system
at table create time. It would be accessible and would function like any other column family,
except that we expect a default ACL allowing access only by the system and operator principals,
and administrative actions such as renaming or deletion would not be allowed.
 Into the metacolumn would be stored per-row overrides for such things as ACLs and TTLs. The
metacolumn therefore would be as sparse as possible; no storage would be required for any overrides
if a value is committed with defaults. A reasonably sparse metacolumn for a region may fit
entirely within blockcache. It may be possible for all metacolumns on a RS to fit within blockcache
without undue pressure on other users. We can aim design effort at this target. 
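
To make the sparseness argument concrete, here is a minimal sketch of the idea in Python (not HBase code). The names `FamilyDefaults`, `Table`, `set_override`, `effective_ttl` and `effective_acl` are all hypothetical, and a plain dict stands in for the metacolumn. Rows committed with defaults never get an entry, so only overridden rows cost any storage.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class FamilyDefaults:
    """Column-family level settings (the defaults every row inherits)."""
    ttl_seconds: int
    acl: FrozenSet[str]


@dataclass
class Table:
    family_defaults: Dict[str, FamilyDefaults]
    # Hypothetical stand-in for the metacolumn:
    # row key -> {setting name -> override value}.
    # A row with no overrides has no entry at all.
    metacolumn: Dict[bytes, Dict[str, object]] = field(default_factory=dict)

    def set_override(self, row: bytes, name: str, value: object) -> None:
        self.metacolumn.setdefault(row, {})[name] = value

    def effective_ttl(self, row: bytes, family: str) -> int:
        # Per-row override wins; otherwise fall back to the family default.
        overrides = self.metacolumn.get(row, {})
        return overrides.get("ttl", self.family_defaults[family].ttl_seconds)

    def effective_acl(self, row: bytes, family: str) -> FrozenSet[str]:
        overrides = self.metacolumn.get(row, {})
        return overrides.get("acl", self.family_defaults[family].acl)


# One family "d" with a one-day TTL, readable by the "ops" principal only.
table = Table({"d": FamilyDefaults(ttl_seconds=86400, acl=frozenset({"ops"}))})

# Customer ID is part of the row key; only this row gets an ACL override.
table.set_override(b"cust42/row1", "acl", frozenset({"ops", "cust42"}))
```

Here `table.effective_acl(b"cust42/row1", "d")` returns the override, while any other row, and the TTL of every row, still resolves to the column-family defaults.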
> The scope of changes required to support this is:
> - Introduce metacolumn concept in the code and into the security model (default ACL):
A flag in HCD, a default ACL, and a few additional checks for rejecting disallowed administrative actions.
> - Automatically create metacolumns at table create time.
> - Consult metacolumn as part of processing reads or mutations, perhaps using a bloom
filter to shortcut lookups for rows with no metaentries, and apply configuration or security
policy overrides if found.
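
The last item can be sketched roughly as follows, again in Python rather than against real HBase APIs. `BloomFilter`, `MetaStore` and `check_read` are hypothetical names, the filter sizes are arbitrary, and the hashing scheme (salted SHA-256) is only an assumption chosen to keep the example self-contained. The point is the control flow: a negative Bloom filter answer means the row definitely has no metaentry, so the metacolumn read is skipped and the column-family policy applies.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: bytes):
        # Derive num_hashes bit positions by salting the key with the index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


class MetaStore:
    """Hypothetical metacolumn store guarded by a Bloom filter."""

    def __init__(self):
        self.entries = {}          # row key -> overrides dict
        self.bloom = BloomFilter()
        self.lookups = 0           # number of real metacolumn reads

    def put(self, row: bytes, overrides: dict) -> None:
        self.entries[row] = overrides
        self.bloom.add(row)

    def get(self, row: bytes):
        if not self.bloom.might_contain(row):
            return None            # definitely no metaentry: skip the read
        self.lookups += 1          # may still miss on a false positive
        return self.entries.get(row)


def check_read(store: MetaStore, row: bytes, user: str, family_acl) -> bool:
    """Apply a per-row ACL override if one exists, else the family ACL."""
    overrides = store.get(row)
    acl = overrides["acl"] if overrides and "acl" in overrides else family_acl
    return user in acl


store = MetaStore()
store.put(b"cust42/row1", {"acl": {"ops", "cust42"}})
family_acl = {"ops"}
```

With this setup `check_read(store, b"cust42/row1", "cust42", family_acl)` is allowed by the override, while the same user reading a row without a metaentry is rejected by the family ACL. A false positive only costs an extra lookup and never changes the decision. Note that a plain Bloom filter cannot forget a key, so removing an override would need a rebuild or a counting variant.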

This message was sent by Atlassian JIRA
