hbase-issues mailing list archives

From "Li Pi (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4608) HLog Compression
Date Tue, 13 Mar 2012 04:37:47 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228194#comment-13228194 ]

Li Pi commented on HBASE-4608:

Yo, sorry I can't quite work on this right now. Finals finish this week, and once they're
done, I'll be able to scram.

There doesn't seem to be that much left, though I said that about 3 months ago. My bad! Feel
free to do as you please; there's not much left on this, and I'm happy that work is getting
done. I won't be offended at all if somebody else wants to try their hand at finishing this.

My thinking on it was this: WAL_VERSION is used to indicate the compression type. That works
pretty well, because enabling compression immediately tells older versions that the WAL version
is one they can't read, while newer versions with compression disabled can still function
alongside older versions that have no support for compression.
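
To make the WAL_VERSION idea concrete, here is a rough sketch of how a reader might gate on the version stamped in a log header. Only the name WAL_VERSION comes from the comment above; the class, the method names, and the numeric values are made up for illustration and are not taken from the attached patches.

```java
// Hypothetical sketch: only the WAL_VERSION idea comes from the discussion.
// Class name, method names, and version numbers are assumptions.
final class WalVersionCheck {
    // Assumed values, chosen only so that "compressed" is newer than "plain".
    static final int WAL_VERSION_PLAIN = 1;
    static final int WAL_VERSION_COMPRESSED = 2;

    // Highest log version this (old or new) reader build understands.
    private final int maxSupportedVersion;

    WalVersionCheck(int maxSupportedVersion) {
        this.maxSupportedVersion = maxSupportedVersion;
    }

    /** Version a writer would stamp in the header for the given setting. */
    static int versionFor(boolean compressionEnabled) {
        return compressionEnabled ? WAL_VERSION_COMPRESSED : WAL_VERSION_PLAIN;
    }

    /**
     * Returns true if the log's entries must be decompressed.
     * Rejects logs written with a version newer than this reader supports.
     */
    boolean openLog(int versionInHeader) {
        if (versionInHeader > maxSupportedVersion) {
            throw new IllegalArgumentException("Unsupported WAL version "
                + versionInHeader + " (max supported " + maxSupportedVersion + ")");
        }
        return versionInHeader == WAL_VERSION_COMPRESSED;
    }
}
```

With this shape, a reader built before compression existed fails fast on a compressed log, while a newer writer with compression switched off keeps producing logs the older reader accepts.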

Also, in my old benchmarks I was getting anywhere from a 20% to a 40% increase on YCSB
loads, depending on the test case. This seemed pretty impressive to me. Not sure if
a bug was introduced. I'll run a few more benchmarks later.

> HLog Compression
> ----------------
>                 Key: HBASE-4608
>                 URL: https://issues.apache.org/jira/browse/HBASE-4608
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Li Pi
>            Assignee: stack
>             Fix For: 0.94.0
>         Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 4608v13.txt,
> 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 4608v18.txt, 4608v5.txt,
> 4608v6.txt, 4608v7.txt, 4608v8fixed.txt
>
> The current bottleneck to HBase write speed is replicating the WAL appends across different
> datanodes. We can speed up this process by compressing the HLog. Current plan involves using
> a dictionary to compress table name, region id, cf name, and possibly other bits of repeated
> data. Also, HLog format may be changed in other ways to produce a smaller HLog.
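
The dictionary idea in the issue description could look roughly like the sketch below: the first time a value such as a table name is written it goes out in full, and every later occurrence is replaced by a 2-byte index. This is only an illustration under assumed names (FieldDictionary, NOT_IN_DICTIONARY) and an assumed byte layout; it is not the format used by the attached patches, and it leaves out eviction and any bound on dictionary size.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: names and byte layout are assumptions.
// Limits: fewer than 32768 distinct values, each shorter than 32768 bytes.
final class FieldDictionary {
    // Marker written before a value that has not been seen yet.
    static final short NOT_IN_DICTIONARY = -1;

    private final Map<String, Short> indexByValue = new HashMap<>();
    private final List<String> valueByIndex = new ArrayList<>();

    /** Writes a 2-byte index for a known value, or marker + length + bytes for a new one. */
    void write(ByteBuffer out, String value) {
        Short index = indexByValue.get(value);
        if (index != null) {
            out.putShort(index);
            return;
        }
        byte[] raw = value.getBytes(StandardCharsets.UTF_8);
        out.putShort(NOT_IN_DICTIONARY);
        out.putShort((short) raw.length);
        out.put(raw);
        add(value);
    }

    /** Mirrors write(): rebuilds the same dictionary while reading. */
    String read(ByteBuffer in) {
        short index = in.getShort();
        if (index != NOT_IN_DICTIONARY) {
            return valueByIndex.get(index);
        }
        byte[] raw = new byte[in.getShort()];
        in.get(raw);
        String value = new String(raw, StandardCharsets.UTF_8);
        add(value);
        return value;
    }

    // Both sides assign indices in first-seen order, so they stay in sync.
    private void add(String value) {
        indexByValue.put(value, (short) valueByIndex.size());
        valueByIndex.add(value);
    }
}
```

Because the reader rebuilds the dictionary from the stream itself, nothing extra has to be stored alongside the log; the trade-off is that entries can only be decoded by reading the log from the start.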
