hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "jiraposter@reviews.apache.org (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4608) HLog Compression
Date Tue, 15 Nov 2011 01:05:55 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150139#comment-13150139
] 

jiraposter@reviews.apache.org commented on HBASE-4608:
------------------------------------------------------



bq.  On 2011-11-09 20:00:52, Nicolas Spiegelberg wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java, lines 122-124
bq.  > <https://reviews.apache.org/r/2740/diff/1/?file=56624#file56624line122>
bq.  >
bq.  >     should compression be added to the HLogKey as well to compress regionName &
table?  It seems like the biggest wins will come from table + region + family, which all have
user-bounded values.  It might even make sense to have these values in a different dictionary
from row & column qualifier, which can be unbounded and might accidentally dominate the
dictionary
bq.  
bq.  Li Pi wrote:
bq.      Yes. I'm adding compression for regionname, table, and family as well. For this kind
of simple 1 way associative dictionary, its likely that those two factors will end up dominating,
but other more complex dictionaries can be used, perhaps with more interesting eviction strategies.
bq.      
bq.      I do agree using multiple dictionaries is a simple strategy.

FYI I already compress CF name along with column qualifier. Are regionname and table stored
as part of the row key?


- Li


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2740/#review3136
-----------------------------------------------------------


On 2011-11-07 23:12:37, Li Pi wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2740/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-11-07 23:12:37)
bq.  
bq.  
bq.  Review request for hbase, Eli Collins and Todd Lipcon.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Heres what I have so far. Things are written, and "should work". I need to rework the
test cases to test this, and put something in the config file to enable/disable. Obviously
this isn't ready for commit at the moment, but I can get those two things done pretty quickly.
bq.  
bq.  Obviously the dictionary is incredibly simple at the moment, I'll come up with something
cooler sooner. Let me know how this looks.
bq.  
bq.  
bq.  This addresses bug HBase-4608.
bq.      https://issues.apache.org/jira/browse/HBase-4608
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/KeyValue.java e68e486 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressedKeyValue.java PRE-CREATION

bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/SimpleDictionary.java PRE-CREATION

bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALDictionary.java PRE-CREATION

bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java e1117ef 
bq.  
bq.  Diff: https://reviews.apache.org/r/2740/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Li
bq.  
bq.


                
> HLog Compression
> ----------------
>
>                 Key: HBASE-4608
>                 URL: https://issues.apache.org/jira/browse/HBASE-4608
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Li Pi
>            Assignee: Li Pi
>         Attachments: 4608v1.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends across different
datanodes. We can speed up this process by compressing the HLog. Current plan involves using
a dictionary to compress table name, region id, cf name, and possibly other bits of repeated
data. Also, HLog format may be changed in other ways to produce a smaller HLog.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message