hbase-issues mailing list archives

From "jiraposter@reviews.apache.org (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4608) HLog Compression
Date Wed, 14 Mar 2012 22:22:50 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229697#comment-13229697 ]

jiraposter@reviews.apache.org commented on HBASE-4608:
------------------------------------------------------



bq.  On 2012-03-14 17:42:21, Lars Hofhansl wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java, line 107
bq.  > <https://reviews.apache.org/r/4328/diff/2/?file=92102#file92102line107>
bq.  >
bq.  >     Nit: Comment here that the status byte is the higher order byte of the dict entry.

done in next version
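
One way to picture the "status byte is the higher order byte of the dict entry" remark is the sketch below. It is only an illustration under stated assumptions, not the code from the patch: the class name StatusByteSketch, the 4-byte length prefix, and the use of strings instead of byte arrays are all made up here. The idea shown is that a 2-byte dictionary index below 2^15 always has a high byte in the range 0x00..0x7F, so a first byte of -1 (0xFF) can unambiguously mean "not in dictionary, raw value follows".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and wire layout are assumptions, not the patch.
final class StatusByteSketch {
  static final byte NOT_IN_DICTIONARY = -1; // 0xFF, never a valid high byte
  static final int MAX_ENTRIES = 1 << 15;   // indices 0..32767, high byte 0x00..0x7F

  // Writes each value either as a 2-byte index or as status + length + raw bytes.
  static byte[] encode(List<String> values) {
    Map<String, Short> dict = new HashMap<>();
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      for (String value : values) {
        Short idx = dict.get(value);
        if (idx != null) {
          out.writeShort(idx); // big-endian: the high byte doubles as the status byte
          continue;
        }
        byte[] raw = value.getBytes(StandardCharsets.UTF_8);
        out.writeByte(NOT_IN_DICTIONARY);
        out.writeInt(raw.length);
        out.write(raw);
        if (dict.size() < MAX_ENTRIES) {
          dict.put(value, (short) dict.size());
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  // Reads the first byte as a status; anything other than -1 is the index's high byte.
  static List<String> decode(byte[] encoded) {
    List<String> dict = new ArrayList<>();
    List<String> values = new ArrayList<>();
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(encoded))) {
      while (in.available() > 0) {
        byte status = in.readByte();
        if (status == NOT_IN_DICTIONARY) {
          byte[] raw = new byte[in.readInt()];
          in.readFully(raw);
          String value = new String(raw, StandardCharsets.UTF_8);
          if (dict.size() < MAX_ENTRIES) {
            dict.add(value); // reader mirrors the writer's insertion order
          }
          values.add(value);
        } else {
          int idx = ((status & 0xFF) << 8) | (in.readByte() & 0xFF);
          values.add(dict.get(idx));
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return values;
  }
}
```

In this sketch a repeated value costs two bytes on the wire, and the scheme only works because the dictionary never hands out an index whose high byte could be 0xFF.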


bq.  On 2012-03-14 17:42:21, Lars Hofhansl wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java, line 108
bq.  > <https://reviews.apache.org/r/4328/diff/2/?file=92102#file92102line108>
bq.  >
bq.  >     I assume we're entirely sure that a dictionary will never have > 2^15 entries.
bq.  
bq.  Li Pi wrote:
bq.      It'll start evicting once it hits its max size, which is currently 2 ^ 15.

Added a comment to LRUDictionary on what happens when it hits its limit, as well as a comment on
the max expected size of the dictionary for any one WAL.
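
As a rough illustration of "start evicting once it hits its max size", here is a minimal sketch of a capacity-bounded dictionary. It is not the LRUDictionary from the patch: the class name LruDictionarySketch, the method names, the least-recently-used policy built on LinkedHashMap, and the reuse of the evicted entry's index are all assumptions made for the example. The thread itself only states that eviction begins at the max size, currently 2^15.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a bounded dictionary; not the class under review.
final class LruDictionarySketch {
  static final short NOT_IN_DICTIONARY = -1;

  private final int capacity;
  private final String[] byIndex;                    // index -> value
  private final LinkedHashMap<String, Short> byValue; // value -> index, in access order

  LruDictionarySketch(int capacity) {
    if (capacity < 1 || capacity > (1 << 15)) {
      throw new IllegalArgumentException("capacity must be in 1..2^15");
    }
    this.capacity = capacity;
    this.byIndex = new String[capacity];
    this.byValue = new LinkedHashMap<>(16, 0.75f, true); // true = access order
  }

  // Returns the index of value, or NOT_IN_DICTIONARY; a hit counts as a use.
  short findEntry(String value) {
    Short idx = byValue.get(value);
    return idx == null ? NOT_IN_DICTIONARY : idx;
  }

  // Adds value, evicting the least recently used entry once the dictionary is full.
  short addEntry(String value) {
    Short existing = byValue.get(value);
    if (existing != null) {
      return existing;
    }
    short idx;
    if (byValue.size() < capacity) {
      idx = (short) byValue.size();                  // next unused slot
    } else {
      Iterator<Map.Entry<String, Short>> it = byValue.entrySet().iterator();
      Map.Entry<String, Short> eldest = it.next();   // least recently used
      idx = eldest.getValue();                       // reuse its slot
      it.remove();
    }
    byIndex[idx] = value;
    byValue.put(value, idx);
    return idx;
  }

  String getEntry(short idx) {
    return byIndex[idx];
  }
}
```

With a scheme like this, a reader and a writer stay in sync only if both apply the same sequence of lookups and additions, since the eviction order depends on access history.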


bq.  On 2012-03-14 17:42:21, Lars Hofhansl wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java, line 128
bq.  > <https://reviews.apache.org/r/4328/diff/2/?file=92102#file92102line128>
bq.  >
bq.  >     Nit: The naming convention is a bit strange.
bq.  >     This one is called uncompress... whereas the method returning a new byte[] is called readCompressed

It's not the worst.  It's descriptive, I think.


bq.  On 2012-03-14 17:42:21, Lars Hofhansl wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java, line 1678
bq.  > <https://reviews.apache.org/r/4328/diff/2/?file=92104#file92104line1678>
bq.  >
bq.  >     Have a constructor that takes a compression context too?
bq.  >     It seems like once anything has been written to the HLog this should be immutable.

That won't work for the write case, since WAL compression is internal to the wal package and the
HLog.Entry used for writing is made outside of HLog, which means that for the write case we need
the above method.  It might work on the read side, though there we allow 'reuse' of the shell
HLog.Entry, so we would need the above method on the read side too.


bq.  On 2012-03-14 17:42:21, Lars Hofhansl wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java, line 53
bq.  > <https://reviews.apache.org/r/4328/diff/2/?file=92105#file92105line53>
bq.  >
bq.  >     COMPRESSED is a bit of a strange name.
bq.  >     It happens to be a version of the WAL that supports compression, but it is not necessarily compressed.

Added a comment that this enum means 'The WAL version that first had compression'.


bq.  On 2012-03-14 17:42:21, Lars Hofhansl wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java, line 303
bq.  > <https://reviews.apache.org/r/4328/diff/2/?file=92105#file92105line303>
bq.  >
bq.  >     ugly whitespace :)

Fixed in next version.


bq.  On 2012-03-14 17:42:21, Lars Hofhansl wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/wal/LRUDictionary.java, line 32
bq.  > <https://reviews.apache.org/r/4328/diff/2/?file=92107#file92107line32>
bq.  >
bq.  >     I think I had that question to Li Pi... How much memory do we expect this dictionary to take worst case?
bq.  >     I guess since there is one WAL per region server and it is rolled periodically it is not a problem at all.
bq.  
bq.  Li Pi wrote:
bq.      65536 * 5 ( Regionname, Row key, CF, Column qual, table) * 100 bytes (these are some big names) = 32768000 bytes. Or 32 megabytes.
bq.      
bq.      If you want to get silly, even at 1kb entries (wtf are you naming things?), it maxes out at 320 megabytes.
bq.  
bq.  Li Pi wrote:
bq.      Actually halve those amounts, 2^15, not 2^16.

Added the above as a class comment.
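
The back-of-envelope numbers quoted above can be reproduced with a few lines of arithmetic. This is only a sketch: the class and method names are hypothetical, "1kb" is taken as 1000 bytes to match the rounded figures in the thread, and per-entry object overhead is ignored.

```java
// Hypothetical helper reproducing the worst-case estimate from the thread.
final class DictionaryMemoryEstimate {
  // entries per dictionary * number of dictionaries * bytes per entry
  static long worstCaseBytes(int entriesPerDictionary, int dictionaries, int bytesPerEntry) {
    return (long) entriesPerDictionary * dictionaries * bytesPerEntry;
  }

  public static void main(String[] args) {
    int dictionaries = 5; // region name, row key, CF, column qualifier, table
    System.out.println(worstCaseBytes(1 << 16, dictionaries, 100));  // first estimate, 2^16 entries
    System.out.println(worstCaseBytes(1 << 15, dictionaries, 100));  // corrected, 2^15 entries
    System.out.println(worstCaseBytes(1 << 15, dictionaries, 1000)); // corrected, 1kb entries
  }
}
```

With 2^15 entries this gives 16,384,000 bytes (about 16 megabytes) for 100-byte entries and 163,840,000 bytes (about 160 megabytes) for 1kb entries, which is half of the first estimate, as the correction says.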


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4328/#review5951
-----------------------------------------------------------


On 2012-03-14 07:34:58, Michael Stack wrote:
bq.  
bq.  
bq.  (Updated 2012-03-14 07:34:58)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  See issue
bq.  
bq.  
bq.  This addresses bug hbase-4608.
bq.      https://issues.apache.org/jira/browse/hbase-4608
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java 045c6f3
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressionContext.java PRE-CREATION
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java PRE-CREATION
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/Dictionary.java PRE-CREATION
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java b5049b1
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java 311ea1b
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/KeyValueCompression.java PRE-CREATION
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/LRUDictionary.java PRE-CREATION
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java ff63a5f
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogWriter.java 01ebb5c
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java d8f317c
bq.    src/main/java/org/apache/hadoop/hbase/util/Bytes.java de8e40b
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestCompressor.java PRE-CREATION
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestKeyValueCompression.java PRE-CREATION
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLRUDictionary.java PRE-CREATION
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java a11899c
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplayCompressed.java PRE-CREATION
bq.  
bq.  Diff: https://reviews.apache.org/r/4328/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Michael
bq.  
bq.


                
> HLog Compression
> ----------------
>
>                 Key: HBASE-4608
>                 URL: https://issues.apache.org/jira/browse/HBASE-4608
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Li Pi
>            Assignee: stack
>             Fix For: 0.94.0
>
>         Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 4608v18.txt, 4608v23.txt, 4608v24.txt, 4608v25.txt, 4608v27.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt, hbase-4608-v28-delta.txt, hbase-4608-v28.txt, hbase-4608-v28.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. Current plan involves using a dictionary to compress table name, region id, cf name, and possibly other bits of repeated data. Also, HLog format may be changed in other ways to produce a smaller HLog.


        
