hbase-issues mailing list archives

From "jiraposter@reviews.apache.org (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4608) HLog Compression
Date Fri, 23 Dec 2011 06:01:37 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175296#comment-13175296 ]

jiraposter@reviews.apache.org commented on HBASE-4608:

(Updated 2011-12-23 06:00:24.065183)

Review request for hbase, Eli Collins and Todd Lipcon.


Some new things for WALCompress.

I've modified TestWALReplay to test compression. This is a quick hack to get effective test
cases; I'll build my own subset of tests later.

Integration is done, including config, but it doesn't all work yet. It worked before I tried
compressing HLogKeys; SequenceFile seems to read them out of order, which causes the reader to
hit empty dictionary entries. I'm not sure what to do about this. Any advice?

If you only compress KeyValues/WALEdits, it works fine.
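
For illustration, here is a minimal sketch of one plausible way the symptom above can arise. It assumes a stateful dictionary in which an entry exists only after its literal record has been read, so decoding a later record first finds an empty slot. The class OrderSensitiveDictionary, its Decoder, and the "N:"/"R:" record format are hypothetical and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; these are not the classes from the patch. */
public class OrderSensitiveDictionary {

    /** Encodes a value as "N:<literal>" on first sight and "R:<index>" afterwards. */
    static List<String> encode(List<String> values) {
        Map<String, Integer> index = new HashMap<>();
        List<String> out = new ArrayList<>();
        for (String v : values) {
            Integer i = index.get(v);
            if (i == null) {
                index.put(v, index.size());
                out.add("N:" + v);
            } else {
                out.add("R:" + i);
            }
        }
        return out;
    }

    /** Stateful decoder: it learns an entry only when it reads the literal record. */
    static final class Decoder {
        private final List<String> entries = new ArrayList<>();

        String decode(String record) {
            String body = record.substring(2);
            if (record.startsWith("N:")) {
                entries.add(body);
                return body;
            }
            int i = Integer.parseInt(body);
            if (i >= entries.size()) {
                // The literal that defines this entry has not been read yet.
                throw new IllegalStateException("empty dictionary entry " + i);
            }
            return entries.get(i);
        }
    }

    public static void main(String[] args) {
        List<String> encoded = encode(Arrays.asList("tableA", "tableB", "tableA"));

        Decoder inOrder = new Decoder();
        for (String record : encoded) {
            System.out.println(inOrder.decode(record));
        }

        Decoder outOfOrder = new Decoder();
        try {
            // Reading the third record first skips the literal it refers to.
            outOfOrder.decode(encoded.get(2));
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

If this is what is happening, any reader that touches keys in a different order than they were written would need either a dictionary that is fully built before decoding starts or records that can be decoded without earlier state.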


Here's what I have so far. Things are written, and "should work". I need to rework the test
cases to test this, and put something in the config file to enable/disable. Obviously this
isn't ready for commit at the moment, but I can get those two things done pretty quickly.

Obviously the dictionary is incredibly simple at the moment; I'll come up with something cooler
soon. Let me know how this looks.

This addresses bug HBASE-4608.

Diffs (updated)

  src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressedKeyValue.java PRE-CREATION
  src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java PRE-CREATION
  src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java 24407af
  src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java f067221
  src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java d9cd6de
  src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogWriter.java cbef70f
  src/main/java/org/apache/hadoop/hbase/regionserver/wal/SimpleDictionary.java PRE-CREATION
  src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALDictionary.java PRE-CREATION
  src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java e1117ef
  src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java 59910bf

Diff: https://reviews.apache.org/r/2740/diff




> HLog Compression
> ----------------
>                 Key: HBASE-4608
>                 URL: https://issues.apache.org/jira/browse/HBASE-4608
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Li Pi
>            Assignee: Li Pi
>         Attachments: 4608v1.txt
> The current bottleneck to HBase write speed is replicating the WAL appends across different
> datanodes. We can speed up this process by compressing the HLog. Current plan involves using
> a dictionary to compress table name, region id, cf name, and possibly other bits of repeated
> data. Also, HLog format may be changed in other ways to produce a smaller HLog.
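
As a rough illustration of the dictionary idea in the description above, the sketch below writes a repeated field (table name, region id, cf name) in full the first time and as a 2-byte index afterwards. Everything in it is an assumption made for the example: the class name WalFieldDictionary, the LITERAL/REFERENCE marker bytes, and the wire layout are hypothetical and do not reflect the classes or format in the patch. Fields are assumed to be shorter than 64 KiB.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of dictionary-encoding repeated WAL fields. */
public class WalFieldDictionary {
    private static final byte LITERAL = 0;   // full field follows and is added to the dictionary
    private static final byte REFERENCE = 1; // a 2-byte dictionary index follows

    private final Map<String, Short> indexByValue = new HashMap<>();
    private final List<String> valueByIndex = new ArrayList<>();

    /** Writes the field in full on first use, and as a short index after that. */
    void write(DataOutputStream out, String field) throws IOException {
        Short idx = indexByValue.get(field);
        if (idx != null) {
            out.writeByte(REFERENCE);
            out.writeShort(idx);
            return;
        }
        byte[] raw = field.getBytes(StandardCharsets.UTF_8);
        out.writeByte(LITERAL);
        out.writeShort(raw.length); // assumes the field is shorter than 64 KiB
        out.write(raw);
        remember(field);
    }

    /** Mirrors write(): the reader builds the same dictionary as it goes. */
    String read(DataInputStream in) throws IOException {
        if (in.readByte() == REFERENCE) {
            return valueByIndex.get(in.readShort());
        }
        byte[] raw = new byte[in.readUnsignedShort()];
        in.readFully(raw);
        String field = new String(raw, StandardCharsets.UTF_8);
        remember(field);
        return field;
    }

    /** Adds an entry while there is room; writer and reader apply the same rule. */
    private void remember(String field) {
        if (valueByIndex.size() < Short.MAX_VALUE) {
            indexByValue.put(field, (short) valueByIndex.size());
            valueByIndex.add(field);
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        WalFieldDictionary writer = new WalFieldDictionary();
        String[] key = {"usertable", "region-0001", "cf"};
        for (int edit = 0; edit < 3; edit++) {
            for (String field : key) {
                writer.write(out, field);
            }
        }
        out.flush();
        System.out.println("encoded bytes: " + buf.size());

        WalFieldDictionary reader = new WalFieldDictionary();
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(buf.toByteArray()));
        for (int i = 0; i < 9; i++) {
            System.out.println(reader.read(in));
        }
    }
}
```

In this layout every repeated field costs 3 bytes instead of its full length plus a 3-byte header, so the saving grows with how often the same table, region, and column family appear in consecutive edits. The reader only stays in sync if it sees records in the order they were written.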
