hbase-issues mailing list archives

From "stack (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4608) HLog Compression
Date Fri, 09 Mar 2012 21:21:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226477#comment-13226477 ]

stack commented on HBASE-4608:

The TestLRUDictionary test looks like it could be fatter.  Looks like you should be able to
throw a bunch more combinations at it, and to exercise the new BidirectionalLRUMap type
better.  Better to find the issues here in a unit test than....
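
As an illustration of the kind of combinations meant here (hits, misses, eviction order, index reuse), below is a minimal sketch. It is not the patch's LRUDictionary or BidirectionalLRUMap: the class LruDictionarySketch and its methods findEntry, addEntry and getEntry are hypothetical stand-ins built on java.util.LinkedHashMap, assuming a fixed-capacity dictionary that maps byte[] entries to short indexes and back.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a bidirectional LRU dictionary; not the code in the patch.
final class LruDictionarySketch {
    private final int capacity;
    // entry -> index, kept in access order so the first element is the least recently used
    private final LinkedHashMap<ByteBuffer, Short> toIndex = new LinkedHashMap<>(16, 0.75f, true);
    // index -> entry
    private final byte[][] toEntry;

    LruDictionarySketch(int capacity) {
        this.capacity = capacity;
        this.toEntry = new byte[capacity][];
    }

    // Returns the index of a known entry (refreshing its recency), or -1 if absent.
    short findEntry(byte[] data) {
        Short idx = toIndex.get(ByteBuffer.wrap(data));
        return idx == null ? -1 : idx;
    }

    // Adds an entry, evicting the least recently used one and reusing its index when full.
    short addEntry(byte[] data) {
        short existing = findEntry(data);
        if (existing >= 0) {
            return existing;
        }
        short idx;
        if (toIndex.size() < capacity) {
            idx = (short) toIndex.size();
        } else {
            Iterator<Map.Entry<ByteBuffer, Short>> it = toIndex.entrySet().iterator();
            idx = it.next().getValue();
            it.remove();
        }
        byte[] copy = data.clone();
        toIndex.put(ByteBuffer.wrap(copy), idx);
        toEntry[idx] = copy;
        return idx;
    }

    // Reverse lookup: index -> entry (null if the slot was never filled).
    byte[] getEntry(short idx) {
        return toEntry[idx];
    }

    private static byte[] b(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        LruDictionarySketch dict = new LruDictionarySketch(2);
        System.out.println("add a -> " + dict.addEntry(b("a")));
        System.out.println("add b -> " + dict.addEntry(b("b")));
        System.out.println("find a -> " + dict.findEntry(b("a"))); // touch "a" so "b" becomes eldest
        System.out.println("add c -> " + dict.addEntry(b("c")));   // evicts "b", reuses its index
        System.out.println("find b -> " + dict.findEntry(b("b")));
    }
}
```

A fuller test would run the same checks at capacity boundaries, with duplicate adds, and with a reader-side dictionary replaying the writer's operations to confirm both sides stay in step.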

What's the difference between

+  public static int hashBytes(byte[] bytes, int offset, int length) {

and the existing

  public static int hashCode(final byte [] b, final int length) {

They look like they do the same thing.  We should remove the new one if so.
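
If the two really compute the same hash, the only difference is the offset parameter, and the length-only form can delegate to the range form instead of duplicating the loop. A minimal sketch follows, assuming both use the usual 31-multiplier polynomial seeded with 1 (the method bodies are not quoted in this comment, so that is an assumption). ByteHashSketch is a hypothetical class, not HBase's Bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration; the method names mirror the two signatures above.
final class ByteHashSketch {
    // Range form: hashes bytes[offset, offset + length) with a 31-multiplier polynomial.
    static int hashBytes(byte[] bytes, int offset, int length) {
        int hash = 1;
        for (int i = offset; i < offset + length; i++) {
            hash = 31 * hash + bytes[i];
        }
        return hash;
    }

    // Prefix form: the same hash over bytes[0, length), expressed as a delegation.
    static int hashCode(final byte[] b, final int length) {
        return hashBytes(b, 0, length);
    }

    public static void main(String[] args) {
        byte[] key = "table1,region7,cf".getBytes(StandardCharsets.UTF_8);
        // Prefix form and range form agree when the offset is 0.
        System.out.println(hashCode(key, 6) == hashBytes(key, 0, 6));
        // Over the whole array this is the same polynomial as java.util.Arrays.hashCode(byte[]).
        System.out.println(hashBytes(key, 0, key.length) == Arrays.hashCode(key));
    }
}
```

Under that assumption, keeping one implementation and one thin overload avoids the two drifting apart.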

We will have a keycontext when we are deserializing?  How does that work?

So we compress at the individual entry level?  Why not a file at a time? (Sorry if this has
been explained earlier)

Is this right in the WALReader?

+    compression = conf.getBoolean(HConstants.ENABLE_WAL_COMPRESSION, false);

How does that work if the WAL was written compressed but this flag is false?  We break?  Shouldn't
this instead be keyed off the entries themselves?  Should it be a sequence file attribute
saying this is a compressed file?
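
To make the suggestion concrete, here is a minimal sketch of a reader that takes the compression setting from the file itself rather than from configuration. It uses a made-up five-byte header (a magic number plus a boolean) written with plain java.io streams in place of a real sequence file attribute. WalHeaderSketch, MAGIC, writeHeader and isCompressed are all hypothetical names for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical file header; stands in for an attribute stored in the WAL file's metadata.
final class WalHeaderSketch {
    static final int MAGIC = 0x57414C31; // made-up marker, ASCII "WAL1"

    // Writer side: record in the file whether the entries that follow are compressed.
    static byte[] writeHeader(boolean compressed) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(MAGIC);
            out.writeBoolean(compressed);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Reader side: decide from the file, so a mismatched config flag cannot break reads.
    static boolean isCompressed(byte[] file) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(file))) {
            if (in.readInt() != MAGIC) {
                throw new IllegalArgumentException("not a sketch WAL header");
            }
            return in.readBoolean();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean configFlag = false;              // reader's config says "no compression"
        byte[] file = writeHeader(true);         // but the file was written compressed
        boolean compression = isCompressed(file); // the file's own marker wins
        System.out.println("config=" + configFlag + ", file says compressed=" + compression);
    }
}
```

With this shape the config flag only controls how new files are written; existing files stay readable whichever way the flag is set later.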

Do we foresee replication being able to use this facility?  Seems like a natural fit to have it
ship compressed entries.

Good stuff.
> HLog Compression
> ----------------
>                 Key: HBASE-4608
>                 URL: https://issues.apache.org/jira/browse/HBASE-4608
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Li Pi
>            Assignee: Li Pi
>             Fix For: 0.94.0
>         Attachments: 4608-v19.txt, 4608v1.txt, 4608v13.txt, 4608v13.txt, 4608v14.txt,
> 4608v15.txt, 4608v16.txt, 4608v17.txt, 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt
>
> The current bottleneck to HBase write speed is replicating the WAL appends across different
> datanodes. We can speed up this process by compressing the HLog. Current plan involves using
> a dictionary to compress table name, region id, cf name, and possibly other bits of repeated
> data. Also, HLog format may be changed in other ways to produce a smaller HLog.
