hbase-issues mailing list archives

From "jiraposter@reviews.apache.org (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4608) HLog Compression
Date Wed, 09 Nov 2011 19:05:53 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13147225#comment-13147225 ]

jiraposter@reviews.apache.org commented on HBASE-4608:
------------------------------------------------------



bq.  On 2011-11-07 23:39:59, Lars Hofhansl wrote:
bq.  > Cool stuff.
bq.  > 
bq.  > I am probably just missing something... But when is the dictionary itself stored?
Don't we need it to read out the logs again?
bq.  > 
bq.  > Just so I understand: We build up the dictionary as we go along. In the beginning
most things won't be in the dictionary, we write them out and add them to the dict, and from
that time on when we encounter them again we just write the index.
bq.  > On the read we could also build up the dict as we go along, because when values
weren't in the dictionary they were written into the file, so we can recreate the dictionary
as we read. Right?
bq.  > 
bq.  > (As I said, I am probably missing something).
bq.  > 
bq.  > See minor comments inline.
bq.  
bq.  Li Pi wrote:
bq.      You aren't missing anything! That's exactly how it works.
bq.      
bq.      Each WAL starts off with a brand new shiny dictionary. We build up the dictionary
as we write, and when we read, we start off with a shiny new dictionary again. The dictionary
is recreated upon read.
bq.  
bq.  Lars Hofhansl wrote:
bq.      Ok... What I cannot find then, is the code that builds the dictionary during read
:)
bq.      
bq.      Also as a general concern... We write these WAL logs (in part) for redundancy. Compression
is the opposite of redundancy... So say, we garble the beginning of a WAL file, then the entire
file will be useless to us... I don't think that is a big deal, though. As the WAL entries
are variable length this is mostly true even today.
bq.
bq.  
bq.  Li Pi wrote:
bq.      Oops, somehow I deleted that line. There are comments for it. Added it back in.
bq.      
bq.      //if this isn't in the dictionary, we need to add to the dictionary.
bq.      
bq.      As for the more general concern: HBase won't return a write to the client until the
WALEdit write is completely done. So aborting midway won't be an issue, and even if we abort
midway, we can recover everything that's been written so far.
bq.      
bq.      As for the beginning of the file getting garbled: true, but we'd lose some information
with or without compression. With compression we lose more information, but that's the nature
of compression. Recovering a partially garbled WAL fully is impossible no matter what approach
we use. Either way, it's not a contingency the WAL is built to handle: a partial recovery
after all WAL replicas have been corrupted.
bq.  
bq.  Todd Lipcon wrote:
bq.      Well, in the non-compressed WAL case, we can re-sync to a SequenceFile "SYNC" marker
and continue reading from there in the face of arbitrary corruption.
bq.      
bq.      Perhaps the compression mechanism should have some kind of "maximum lookback", i.e.
when a dictionary is being built, keep the file offset where each dictionary word was used.
Then, when deciding to use a dict reference vs a literal, if the curOffset - lastUsedOffset
> MAX_LOOKBACK_THRESHOLD, we re-write the entry. This would bound the size of unrecoverable
WAL portions while still providing good compression (similar to what we have today).

That makes sense. Maybe file a separate jira and use this one to get the compression in?


- Lars
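
For illustration only, the "maximum lookback" idea quoted above could be sketched roughly as follows. This is not code from the patch under review: the class name LookbackDictionary, the lookup method, and the -1 return convention are all hypothetical. It also assumes that the offset being tracked is where a word was last written as a literal, since that is what bounds how far back a reader has to look.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a writer-side dictionary with a bounded lookback. */
final class LookbackDictionary {
    private final long maxLookback; // plays the role of MAX_LOOKBACK_THRESHOLD, in bytes
    // value -> { dictionary index, file offset of the most recent literal copy }
    private final Map<String, long[]> entries = new HashMap<>();
    private int nextIndex = 0;

    LookbackDictionary(long maxLookback) {
        this.maxLookback = maxLookback;
    }

    /**
     * Returns the dictionary index to write as a reference, or -1 when the
     * caller must write the value as a literal at curOffset.
     */
    int lookup(String value, long curOffset) {
        long[] entry = entries.get(value);
        if (entry != null && curOffset - entry[1] <= maxLookback) {
            return (int) entry[0]; // recent enough: a reference is safe
        }
        // Unknown, or the last literal is too far back: re-write the entry.
        // Every literal takes the next index, so a reader that appends each
        // literal it sees to its own dictionary stays in sync.
        entries.put(value, new long[] {nextIndex++, curOffset});
        return -1;
    }
}
```

With a threshold of 100, a word written as a literal at offset 0 is referenced at offset 50 but re-written at offset 101, so a reader that loses the start of the file can rebuild that entry within one lookback window.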


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2740/#review3093
-----------------------------------------------------------


On 2011-11-07 23:12:37, Li Pi wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2740/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-11-07 23:12:37)
bq.  
bq.  
bq.  Review request for hbase, Eli Collins and Todd Lipcon.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Here's what I have so far. Things are written, and "should work". I need to rework the
test cases to test this, and put something in the config file to enable/disable. Obviously
this isn't ready for commit at the moment, but I can get those two things done pretty quickly.
bq.  
bq.  Obviously the dictionary is incredibly simple at the moment, I'll come up with something
cooler sooner. Let me know how this looks.
bq.  
bq.  
bq.  This addresses bug HBase-4608.
bq.      https://issues.apache.org/jira/browse/HBase-4608
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/KeyValue.java e68e486 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressedKeyValue.java PRE-CREATION
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/SimpleDictionary.java PRE-CREATION
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALDictionary.java PRE-CREATION
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java e1117ef 
bq.  
bq.  Diff: https://reviews.apache.org/r/2740/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Li
bq.  
bq.


                
> HLog Compression
> ----------------
>
>                 Key: HBASE-4608
>                 URL: https://issues.apache.org/jira/browse/HBASE-4608
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Li Pi
>            Assignee: Li Pi
>         Attachments: 4608v1.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends across different
datanodes. We can speed up this process by compressing the HLog. Current plan involves using
a dictionary to compress table name, region id, cf name, and possibly other bits of repeated
data. Also, HLog format may be changed in other ways to produce a smaller HLog.
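
As a rough illustration of the dictionary approach described here and in the review thread, the sketch below writes each repeated value (a table name, region id, or cf name) as a literal the first time and as a dictionary index afterwards, and rebuilds the same dictionary while reading. It is not the code from the attached patch: the class DictionaryRoundTrip, the LITERAL/REFERENCE tag bytes, and the leading value count are assumptions made to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; not the classes from the patch under review. */
final class DictionaryRoundTrip {
    private static final byte LITERAL = 0;   // the value itself follows
    private static final byte REFERENCE = 1; // a dictionary index follows

    /** Writer: starts with an empty dictionary and fills it while writing. */
    static byte[] write(List<String> values) {
        Map<String, Integer> dict = new HashMap<>();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(values.size());
            for (String value : values) {
                Integer index = dict.get(value);
                if (index == null) {
                    // Not in the dictionary yet: add it and write the literal.
                    dict.put(value, dict.size());
                    out.writeByte(LITERAL);
                    out.writeUTF(value);
                } else {
                    out.writeByte(REFERENCE);
                    out.writeInt(index);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reader: also starts empty and recreates the dictionary from the literals. */
    static List<String> read(byte[] data) {
        List<String> dict = new ArrayList<>();
        List<String> result = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                if (in.readByte() == LITERAL) {
                    String value = in.readUTF();
                    dict.add(value); // same order as the writer, so indices match
                    result.add(value);
                } else {
                    result.add(dict.get(in.readInt()));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }
}
```

Because the reader sees the literals in the same order the writer emitted them, both sides assign identical indices and the dictionary itself never has to be stored in the file.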

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
