hbase-issues mailing list archives

From "stack (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4608) HLog Compression
Date Thu, 15 Mar 2012 04:16:01 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229877#comment-13229877 ]

stack commented on HBASE-4608:

@Ted Would suggest that in future you not piecemeal your reviews.  Bulk them up.  When
review comes in dribs and drabs, the whole process takes way longer.

@Lars "What portion of the WAL storage do the current WALs represent?"

Do you mean, how much of our footprint is made up of WAL logs?  Not sure.  I thought the intent
of this issue was to speed up syncs because there'd be fewer bytes to shuttle across the datanode
replica pipeline.

I'm now wondering if this patch is worth adding?  If compressible stuff is only shrinking
by half, is that a big enough win?  What do you lot think?  LZMA is not viable because it takes
forever to compress, though it's turning SU WALs into 11-14% of original size.

Let me try adding lzo numbers, but we wouldn't want to use lzo anyway because we could lose
a bunch of edits off the end if the compression block was not closed off (that's my understanding;
I could be wrong).

Li, what happens if we cut the end off a dictionary-compressed file?  Will we be able to read
up to the last byte or word or so?

Good stuff.
> HLog Compression
> ----------------
>                 Key: HBASE-4608
>                 URL: https://issues.apache.org/jira/browse/HBASE-4608
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Li Pi
>            Assignee: stack
>             Fix For: 0.94.0
>         Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 4608v13.txt,
4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 4608v18.txt, 4608v23.txt,
4608v24.txt, 4608v25.txt, 4608v27.txt, 4608v29.txt, 4608v30.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt,
4608v8fixed.txt, hbase-4608-v28-delta.txt, hbase-4608-v28.txt, hbase-4608-v28.txt
> The current bottleneck to HBase write speed is replicating the WAL appends across different
datanodes. We can speed up this process by compressing the HLog. Current plan involves using
a dictionary to compress table name, region id, cf name, and possibly other bits of repeated
data. Also, HLog format may be changed in other ways to produce a smaller HLog.
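
To make the dictionary idea in the description concrete, here is a minimal sketch in Python. It is not the format used by the attached patches; the 2-byte tag layout, the NEW_ENTRY marker, and the names DictWriter, DictReader, write_log and read_log are all assumptions made up for illustration. Each repeated field (table name, region id, cf name) is written in full the first time it is seen and as a short index afterwards. Because the dictionary entries travel inline with the records, a reader of this toy format can still decode every complete record that precedes a cut at the end of the file.

```python
import io
import struct

# Hypothetical marker: the field follows in full and joins the dictionary.
# Real indexes must stay below this value (the sketch does not enforce it).
NEW_ENTRY = 0xFFFF


def read_exact(src, n):
    """Read exactly n bytes or raise EOFError if the stream ends early."""
    data = src.read(n)
    if len(data) != n:
        raise EOFError("stream ended inside a record")
    return data


class DictWriter:
    """Writes a field in full once, then as a 2-byte index."""

    def __init__(self, out):
        self.out = out
        self.index = {}  # field bytes -> dictionary index

    def write_field(self, value):
        idx = self.index.get(value)
        if idx is None:
            self.index[value] = len(self.index)
            self.out.write(struct.pack(">HH", NEW_ENTRY, len(value)))
            self.out.write(value)
        else:
            self.out.write(struct.pack(">H", idx))


class DictReader:
    """Rebuilds the same dictionary while reading."""

    def __init__(self, src):
        self.src = src
        self.entries = []  # dictionary index -> field bytes

    def read_field(self):
        (tag,) = struct.unpack(">H", read_exact(self.src, 2))
        if tag != NEW_ENTRY:
            return self.entries[tag]
        (length,) = struct.unpack(">H", read_exact(self.src, 2))
        value = read_exact(self.src, length)
        self.entries.append(value)
        return value


def write_log(edits):
    """edits: iterable of (table, region, family, payload) byte strings."""
    out = io.BytesIO()
    writer = DictWriter(out)
    for table, region, family, payload in edits:
        for field in (table, region, family):
            writer.write_field(field)
        out.write(struct.pack(">I", len(payload)))
        out.write(payload)
    return out.getvalue()


def read_log(data):
    """Return every complete edit; stop quietly at a truncated tail."""
    src = io.BytesIO(data)
    reader = DictReader(src)
    edits = []
    while True:
        try:
            table = reader.read_field()
            region = reader.read_field()
            family = reader.read_field()
            (length,) = struct.unpack(">I", read_exact(src, 4))
            payload = read_exact(src, length)
        except EOFError:
            break
        edits.append((table, region, family, payload))
    return edits
```

In this sketch, passing read_log a buffer with its final bytes chopped off returns all edits except the incomplete last one, which is the behaviour the truncation question above is asking about. Whether the actual patch behaves this way depends on its format and is not shown here.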
