hbase-issues mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7716) Row Groups / Row Family / Entity Groups in HBase
Date Thu, 31 Jan 2013 00:25:14 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13567139#comment-13567139 ]

Enis Soztutar commented on HBASE-7716:

bq. If you are thinking of adding optional metadata to KVs, consider HBASE-7448.
Agreed. I was thinking this can be based on HBASE-7448, and/or KV v2 (HBASE-7233). I have to
think some more about how keys should sort with regard to row groups.
bq. What is the interplay with split policies? Split policy would be extended with the row
group concept? How would a split policy know the starting and ending boundary of a row group?
We would write a split policy, similar to KeyPrefixRegionSplitPolicy, and make it the default.
That split policy would find the split point as usual, but then roll it backward or forward to
exclude or include all the rows in the current row group. This ties into the sort order of keys
with row groups.
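
A minimal sketch of the "find the split point as usual, then roll it backward or forward" idea, under loud assumptions: it does not use any HBase API, rows are plain strings held in a sorted list, and the row group key is assumed to be the row key prefix before the first ':' (one option mentioned in this thread, not a settled design). The names RowGroupSplitSketch, rowGroupOf and adjustSplitIndex are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class RowGroupSplitSketch {

    // ASSUMPTION: the row group key is the row key prefix before the first ':'.
    // Every key is assumed to contain the delimiter, so that the rows of one
    // group are contiguous in sorted order.
    static String rowGroupOf(String rowKey) {
        int idx = rowKey.indexOf(':');
        return idx < 0 ? rowKey : rowKey.substring(0, idx);
    }

    // Takes a candidate split index (the first row of the would-be second
    // daughter) and moves it to the nearest row group boundary.
    // Returns -1 when no boundary exists, i.e. all rows are in one group.
    static int adjustSplitIndex(List<String> sortedRows, int candidate) {
        int n = sortedRows.size();
        if (candidate <= 0 || candidate >= n) {
            return -1;
        }
        String group = rowGroupOf(sortedRows.get(candidate));

        // Roll backward to the first row of the current group.
        int back = candidate;
        while (back > 0 && rowGroupOf(sortedRows.get(back - 1)).equals(group)) {
            back--;
        }
        // Roll forward to the first row after the current group.
        int fwd = candidate;
        while (fwd < n && rowGroupOf(sortedRows.get(fwd)).equals(group)) {
            fwd++;
        }

        boolean backOk = back > 0; // otherwise the first daughter would be empty
        boolean fwdOk = fwd < n;   // otherwise the second daughter would be empty
        if (backOk && fwdOk) {
            // Prefer the boundary closer to the original candidate.
            return (candidate - back <= fwd - candidate) ? back : fwd;
        }
        if (backOk) {
            return back;
        }
        return fwdOk ? fwd : -1;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a:1", "a:2", "b:1", "b:2", "b:3", "c:1");
        int split = adjustSplitIndex(rows, 3); // candidate "b:2" is inside group "b"
        System.out.println(split < 0 ? "no valid split" : "split at " + rows.get(split));
    }
}
```

Rolling to the nearer boundary is only one possible choice; a real policy might instead always roll forward, or weigh the resulting region sizes.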
bq. Reminds me of "major key paths" in the Oracle NOSQL DB: http://www.oracle.com/technetwork/products/nosqldb/overview/key-value-497224.html
Thanks for the link. It seems to be a very similar idea (especially if we do row groups as
row key prefixes).
> Row Groups / Row Family / Entity Groups in HBase
> ------------------------------------------------
>                 Key: HBASE-7716
>                 URL: https://issues.apache.org/jira/browse/HBASE-7716
>             Project: HBase
>          Issue Type: New Feature
>          Components: Client, regionserver
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 0.98.0
>         Attachments: Entity Groups in HBase.txt
> This issue is to discuss the possible addition to the HBase data model for "Row Groups".
> As we are nearing 1.0, discussing this for 0.98 seems the right time, especially given
> that we have custom region split policies, local transactions, and API overhaul around data
> types -> bytes.
> Row Groups are semantic groupings of rows in the HBase data model. All rows within a
> given row group share the same row group key.
> Row groups are similar to column families in HBase or locality groups in BigTable, but
> transposed to rows instead of columns. All the rows within a row group physically belong together,
> and are served by a single region. This means that region boundaries cannot split a row group.

> Row groups are not predefined, and are dynamic. There can be one row group per row. 
> Row group keys are fully optional, and backwards compatible.

