Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Date: Fri, 23 Dec 2011 06:35:32 +0000 (UTC)
From: "jiraposter@reviews.apache.org (Commented) (JIRA)"
To: issues@hbase.apache.org
Message-ID: <605051265.41811.1324622132466.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1642646625.2855.1318891871332.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HBASE-4608) HLog Compression

[
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175310#comment-13175310 ]

jiraposter@reviews.apache.org commented on HBASE-4608:
------------------------------------------------------

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2740/#review4100
-----------------------------------------------------------

Maybe just do this for WALEdits/KeyValues for now and tackle HLogKey later.
Looks like hash collisions in SimpleDictionary could be nasty.
Other than that mostly whitespace. Cool stuff.

src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressedKeyValue.java

    Should remove the year line. Also some extra whitespace in this file.

src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java

    Bunch of whitespace in here. As said above, maybe do HLogKey in a separate jira.

src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java

    bunch of whitespace in here.

src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java

    whitespace

src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java

    I know this is not done, yet... But needs to be a fully qualified config name.

src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java

    LOG.debug?

src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogWriter.java

    Hardcoding SimpleDictionary here?

src/main/java/org/apache/hadoop/hbase/regionserver/wal/SimpleDictionary.java

    year...

src/main/java/org/apache/hadoop/hbase/regionserver/wal/SimpleDictionary.java

    What if you have a hash collision? You now overwrite the old value that just happens to have the same hash code. Is that OK?

src/main/java/org/apache/hadoop/hbase/regionserver/wal/SimpleDictionary.java

    Here too; what happens for hash collisions?
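
    To make the collision concern concrete, here is a minimal sketch of one way around it, assuming entries are keyed by their byte content rather than by hash code alone. The class name CollisionSafeDictionary and its methods (findEntry, addEntry, getEntry) are hypothetical and are not taken from the patch; the real WALDictionary interface may look different.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch only; not the SimpleDictionary from the patch. */
class CollisionSafeDictionary {
    static final short NOT_IN_DICTIONARY = -1;

    // ByteBuffer.equals() compares contents, so two entries that share a
    // hash code remain distinct keys instead of overwriting each other.
    private final Map<ByteBuffer, Short> indexByContent = new HashMap<>();
    private final List<byte[]> contentByIndex = new ArrayList<>();

    /** Returns the index of data, or NOT_IN_DICTIONARY if it is absent. */
    short findEntry(byte[] data) {
        Short idx = indexByContent.get(ByteBuffer.wrap(data));
        return idx == null ? NOT_IN_DICTIONARY : idx;
    }

    /** Adds data if it is absent and returns its index either way. */
    short addEntry(byte[] data) {
        short existing = findEntry(data);
        if (existing != NOT_IN_DICTIONARY) {
            return existing;
        }
        if (contentByIndex.size() > Short.MAX_VALUE) {
            throw new IllegalStateException("dictionary full");
        }
        // Defensive copy so later changes by the caller cannot corrupt the key.
        byte[] copy = data.clone();
        short idx = (short) contentByIndex.size();
        contentByIndex.add(copy);
        indexByContent.put(ByteBuffer.wrap(copy), idx);
        return idx;
    }

    /** Returns a copy of the bytes stored at idx. */
    byte[] getEntry(short idx) {
        return contentByIndex.get(idx).clone();
    }
}
```

    With something like this, the byte arrays {65, 97} and {66, 66}, which have the same Arrays.hashCode, get two different indices and both round-trip through getEntry. The trade-off is an extra content comparison per lookup and no eviction, so a bounded dictionary would still need a replacement policy on top.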

src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALDictionary.java

    Year... And trailing whitespace in here.

src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java

    bunch of extra leading whitespace in this file

src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java

    Would sure be nice if we had a KeyValue interface and the implementations would just do the right thing.

src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java

    I assume you'll test with/without compression.

- Lars

On 2011-12-23 06:00:24, Li Pi wrote:
bq.
bq. -----------------------------------------------------------
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/2740/
bq. -----------------------------------------------------------
bq.
bq. (Updated 2011-12-23 06:00:24)
bq.
bq. Review request for hbase, Eli Collins and Todd Lipcon.
bq.
bq. Summary
bq. -------
bq.
bq. Here's what I have so far. Things are written, and "should work". I need to rework the test cases to test this, and put something in the config file to enable/disable. Obviously this isn't ready for commit at the moment, but I can get those two things done pretty quickly.
bq.
bq. Obviously the dictionary is incredibly simple at the moment; I'll come up with something cooler soon. Let me know how this looks.
bq.
bq. This addresses bug HBase-4608.
bq. https://issues.apache.org/jira/browse/HBase-4608
bq.
bq. Diffs
bq. -----
bq.
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressedKeyValue.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java 24407af
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java f067221
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java d9cd6de
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogWriter.java cbef70f
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/SimpleDictionary.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALDictionary.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java e1117ef
bq. src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java 59910bf
bq.
bq. Diff: https://reviews.apache.org/r/2740/diff
bq.
bq. Testing
bq. -------
bq.
bq. Thanks,
bq.
bq. Li
bq.

> HLog Compression
> ----------------
>
> Key: HBASE-4608
> URL: https://issues.apache.org/jira/browse/HBASE-4608
> Project: HBase
> Issue Type: New Feature
> Reporter: Li Pi
> Assignee: Li Pi
> Attachments: 4608v1.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. The current plan involves using a dictionary to compress table name, region id, cf name, and possibly other bits of repeated data. Also, the HLog format may be changed in other ways to produce a smaller HLog.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira