hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-9045) Support Dictionary based Tag compression in HFiles
Date Wed, 30 Oct 2013 19:07:27 GMT

    [ https://issues.apache.org/jira/browse/HBASE-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809496#comment-13809496 ]

Andrew Purtell commented on HBASE-9045:
---------------------------------------

bq. do we always need to compress tag by tag or sometimes the entire tag part can be compressed.
 In some cases compressing the entire thing would be simple and would be better for that scenario
I feel. That would impact the WAL compression also then.

Interesting point. For the case where there's just one tag on the cell, the two approaches are
the same, and for cases where there are a number of cells with the exact same set of tags,
compressing the entire tag part would perform better. On the other hand, if cells have many
common tags but the similarities don't coincide on any given cell, then a dictionary of whole
tag parts will be inefficient compared to the per-tag approach.
Probably the per-tag approach is better for the general case.
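
To make the trade-off concrete, here is a rough sketch in Python. It is not the HBase implementation: the cost model (1 marker byte, plus 2 bytes for either a length or a dictionary index) and the names encoded_size, per_tag_size and whole_block_size are assumptions made up for illustration. It only counts how many bytes each strategy would emit for a sequence of cells.

```python
from typing import Iterable, List, Tuple

Tag = bytes
Cell = Tuple[Tag, ...]  # the tags attached to one cell (hypothetical model)


def encoded_size(entries: Iterable[bytes]) -> int:
    """Bytes needed to dictionary-encode a stream of entries.

    Assumed cost model, for illustration only:
      first occurrence: 1 marker byte + 2 length bytes + the raw bytes
      repeat:           1 marker byte + 2 dictionary index bytes
    """
    seen = set()
    total = 0
    for entry in entries:
        if entry in seen:
            total += 1 + 2
        else:
            seen.add(entry)
            total += 1 + 2 + len(entry)
    return total


def per_tag_size(cells: List[Cell]) -> int:
    # Every tag is its own dictionary entry.
    return encoded_size(tag for cell in cells for tag in cell)


def whole_block_size(cells: List[Cell]) -> int:
    # The entire tag part of a cell is one dictionary entry.
    # Plain concatenation is a simplification; real tags carry length prefixes.
    return encoded_size(b"".join(cell) for cell in cells)


a, b, c = b"acl:alice", b"vis:secret", b"ttl:3600"

scenarios = {
    "one tag per cell": [(a,)] * 4,
    "identical tag sets": [(a, b, c)] * 4,
    "overlapping, non-coinciding": [(a, b), (b, c), (a, c), (a, b, c)],
}

for name, cells in scenarios.items():
    print(name, "per-tag:", per_tag_size(cells), "whole:", whole_block_size(cells))
```

Under these assumptions the two strategies tie when each cell has a single tag, the whole-part dictionary wins when every cell repeats the same tag set, and the per-tag dictionary wins when the tags recur in different combinations.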

> Support Dictionary based Tag compression in HFiles
> --------------------------------------------------
>
>                 Key: HBASE-9045
>                 URL: https://issues.apache.org/jira/browse/HBASE-9045
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 0.98.0
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>             Fix For: 0.98.0
>
>         Attachments: HBASE-9045.patch, HBASE-9045_V2.patch
>
>
> Along with the DataBlockEncoding algorithms, Dictionary based Tag compression can be done



