hbase-issues mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-12883) Support block encoding based on knowing set of column qualifiers up front
Date Wed, 21 Jan 2015 01:03:46 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284948#comment-14284948 ]

Enis Soztutar commented on HBASE-12883:

This would be useful in other contexts as well. Even without Phoenix, I expect some users
have a predefined list of column qualifiers that changes very slowly over time.
I think we can even auto-detect the column qualifiers and do dictionary encoding per block,
which would make this very easy to use. We already have the full unencoded block buffered up,
so this should be possible. A per-block dictionary is good, but it won't give us the full
benefits of a per-file dictionary. Maybe we can maintain a small file-global dictionary: if a
block's columns all fit there, just use it, and write the dictionary into the trailer of
the HFile.
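
To make the idea above concrete, here is a rough sketch of how the two dictionaries could interact. Everything in it is an assumption for illustration: the class and method names are hypothetical and are not part of HBase, qualifiers are modeled as Strings rather than byte arrays, and falling back to a per-block dictionary when the file-global one is full is just one possible policy. Nothing is serialized here; the sketch only shows how indexes would be assigned.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names exist in HBase.
class QualifierDictionarySketch {

    // One encoded block: which dictionary applies, plus one index per cell.
    static final class EncodedBlock {
        final boolean usesFileDictionary;
        final List<String> blockDictionary; // empty when the file dictionary is used
        final int[] indexes;

        EncodedBlock(boolean usesFileDictionary, List<String> blockDictionary, int[] indexes) {
            this.usesFileDictionary = usesFileDictionary;
            this.blockDictionary = blockDictionary;
            this.indexes = indexes;
        }
    }

    private final int maxFileEntries; // assumed cap on the file-global dictionary
    private final Map<String, Integer> fileDictionary = new LinkedHashMap<>();

    QualifierDictionarySketch(int maxFileEntries) {
        this.maxFileEntries = maxFileEntries;
    }

    // Encodes the qualifiers of one fully buffered block.
    EncodedBlock encodeBlock(List<String> qualifiers) {
        // Auto-detect the distinct qualifiers of this block in first-seen order.
        Map<String, Integer> blockDict = new LinkedHashMap<>();
        for (String q : qualifiers) {
            blockDict.putIfAbsent(q, blockDict.size());
        }

        // Check whether every qualifier of the block fits in the file-global dictionary.
        int missing = 0;
        for (String q : blockDict.keySet()) {
            if (!fileDictionary.containsKey(q)) {
                missing++;
            }
        }
        boolean fits = fileDictionary.size() + missing <= maxFileEntries;
        if (fits) {
            for (String q : blockDict.keySet()) {
                fileDictionary.putIfAbsent(q, fileDictionary.size());
            }
        }

        // Replace each qualifier with its index in the chosen dictionary.
        Map<String, Integer> dict = fits ? fileDictionary : blockDict;
        int[] indexes = new int[qualifiers.size()];
        for (int i = 0; i < indexes.length; i++) {
            indexes[i] = dict.get(qualifiers.get(i));
        }
        List<String> perBlock = fits ? new ArrayList<>() : new ArrayList<>(blockDict.keySet());
        return new EncodedBlock(fits, perBlock, indexes);
    }

    // Entries that would be written once, for example into the file trailer.
    List<String> fileDictionaryEntries() {
        return new ArrayList<>(fileDictionary.keySet());
    }

    // Restores the qualifiers of a block from its indexes.
    static List<String> decode(EncodedBlock block, List<String> fileDict) {
        List<String> dict = block.usesFileDictionary ? fileDict : block.blockDictionary;
        List<String> out = new ArrayList<>(block.indexes.length);
        for (int index : block.indexes) {
            out.add(dict.get(index));
        }
        return out;
    }
}
```

Because the file-global dictionary only ever grows by appending, indexes handed out to earlier blocks stay valid when later blocks add entries, so a reader would only need the final dictionary from the end of the file.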

> Support block encoding based on knowing set of column qualifiers up front
> -------------------------------------------------------------------------
>                 Key: HBASE-12883
>                 URL: https://issues.apache.org/jira/browse/HBASE-12883
>             Project: HBase
>          Issue Type: Bug
>            Reporter: James Taylor
>              Labels: Phoenix
> Phoenix knows up front the set of column qualifiers a row will have. We could likely
> get some good compression with little CPU based on this by having a block encoding scheme
> that leverages this information. It could be made non-Phoenix specific by identifying the
> set of column qualifiers through metadata passed to the block encoder.

This message was sent by Atlassian JIRA
