Date: Sun, 23 Oct 2011 09:40:32 +0000 (UTC)
From: "Ted Yu (Commented) (JIRA)"
To: issues@hbase.apache.org
Message-ID: <336334297.6393.1319362832150.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1478910597.3820.1317665861348.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HBASE-4532) Avoid top row seek by dedicated bloom filter for delete family bloom filter

    [ 
https://issues.apache.org/jira/browse/HBASE-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133599#comment-13133599 ]

Ted Yu commented on HBASE-4532:
-------------------------------

TestHCM wasn't fixed. If the test fails consistently, maybe you can help debug it.
For the other test failures, it seems the ulimit on the machine running the tests has to be increased.

> Avoid top row seek by dedicated bloom filter for delete family bloom filter
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-4532
>                 URL: https://issues.apache.org/jira/browse/HBASE-4532
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Liyin Tang
>            Assignee: Liyin Tang
>         Attachments: D27.1.patch, D27.1.patch, HBASE-4532-apache-trunk.patch, hbase-4532-89-fb.patch
>
>
> The previous jira, HBASE-4469, avoids the top row seek operation if the row-col bloom filter is enabled.
> This jira tries to avoid the top row seek in all cases by creating a dedicated bloom filter only for delete family.
> The only subtle use case is when we are interested in the top row with an empty column.
> For example,
> we are interested in row1/cf1:/1/put.
> So we seek to the top row: row1/cf1:/MAX_TS/MAXIMUM. The delete family bloom filter will say there is NO delete family.
> Then it will avoid the top row seek and return a fake kv, which is the last kv for this row (createLastOnRowCol).
> In this way, we have already missed the real kv we are interested in.
> The solution to the above problem is to disable this optimization if we are trying to GET/SCAN a row with an empty column.
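
The decision described in the issue text above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the attached patches: `DeleteFamilySeekSketch`, `DeleteFamilyFilter`, `SeekAction`, and `decideTopRowSeek` are hypothetical names, rows and qualifiers are modeled as plain strings, and a `HashSet` stands in for the bloom filter (a real bloom filter may return false positives, which would only cause extra real seeks).

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; none of these names come from HBase itself.
class DeleteFamilySeekSketch {

    enum SeekAction { REAL_SEEK, SKIP_WITH_FAKE_KV }

    // Stand-in for the per-file delete family bloom filter.
    // Assumption: membership is exact here; a real bloom filter can
    // answer "maybe present" for rows that were never added.
    static final class DeleteFamilyFilter {
        private final Set<String> rowsWithDeleteFamily = new HashSet<>();

        void addDeleteFamily(String row) {
            rowsWithDeleteFamily.add(row);
        }

        boolean mightContain(String row) {
            return rowsWithDeleteFamily.contains(row);
        }
    }

    // Decide whether the seek to the top of the row can be skipped.
    static SeekAction decideTopRowSeek(String row, String qualifier,
                                       DeleteFamilyFilter filter) {
        // Empty column: the wanted kv sorts at the top of the row, so
        // returning a fake "last on row/col" kv would skip past it.
        // The optimization is disabled for this case.
        if (qualifier.isEmpty()) {
            return SeekAction.REAL_SEEK;
        }
        // A delete family marker may exist, so the top row must be read.
        if (filter.mightContain(row)) {
            return SeekAction.REAL_SEEK;
        }
        // No delete family for this row: the top row seek can be avoided.
        return SeekAction.SKIP_WITH_FAKE_KV;
    }

    public static void main(String[] args) {
        DeleteFamilyFilter filter = new DeleteFamilyFilter();
        filter.addDeleteFamily("row2");

        System.out.println(decideTopRowSeek("row1", "q1", filter));
        System.out.println(decideTopRowSeek("row1", "", filter));
        System.out.println(decideTopRowSeek("row2", "q1", filter));
    }
}
```

Checking the empty qualifier first keeps the exception independent of the filter contents, which mirrors the issue's statement that the optimization is simply disabled for that case.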
> Evaluation from TestSeekOptimization:
> Previously:
> For bloom=NONE, compr=NONE total seeks without optimization: 2506, with optimization: 1714 (68.40%), savings: 31.60%
> For bloom=ROW, compr=NONE total seeks without optimization: 2506, with optimization: 1714 (68.40%), savings: 31.60%
> For bloom=ROWCOL, compr=NONE total seeks without optimization: 2506, with optimization: 1458 (58.18%), savings: 41.82%
> For bloom=NONE, compr=GZ total seeks without optimization: 2506, with optimization: 1714 (68.40%), savings: 31.60%
> For bloom=ROW, compr=GZ total seeks without optimization: 2506, with optimization: 1714 (68.40%), savings: 31.60%
> For bloom=ROWCOL, compr=GZ total seeks without optimization: 2506, with optimization: 1458 (58.18%), savings: 41.82%
> So we can get about 10% more seek savings ONLY if the ROWCOL bloom filter is enabled. [HBASE-4469]
> ================================================
> After this change:
> For bloom=NONE, compr=NONE total seeks without optimization: 2506, with optimization: 1458 (58.18%), savings: 41.82%
> For bloom=ROW, compr=NONE total seeks without optimization: 2506, with optimization: 1458 (58.18%), savings: 41.82%
> For bloom=ROWCOL, compr=NONE total seeks without optimization: 2506, with optimization: 1458 (58.18%), savings: 41.82%
> For bloom=NONE, compr=GZ total seeks without optimization: 2506, with optimization: 1458 (58.18%), savings: 41.82%
> For bloom=ROW, compr=GZ total seeks without optimization: 2506, with optimization: 1458 (58.18%), savings: 41.82%
> For bloom=ROWCOL, compr=GZ total seeks without optimization: 2506, with optimization: 1458 (58.18%), savings: 41.82%
> So we can get about 10% more seek savings for ALL kinds of bloom filter.
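
The percentages in the evaluation can be re-derived from the raw seek counts. The sketch below is only a worked check of that arithmetic; the class and method names (`SeekSavingsSketch`, `savingsPercent`) are made up for illustration and are not part of TestSeekOptimization.

```java
// Hypothetical helper for re-checking the reported figures.
class SeekSavingsSketch {

    // Savings as a percentage of the unoptimized seek count.
    static double savingsPercent(int seeksWithout, int seeksWith) {
        return 100.0 * (seeksWithout - seeksWith) / seeksWithout;
    }

    public static void main(String[] args) {
        int baseline = 2506;      // total seeks without optimization
        int beforeChange = 1714;  // optimized seeks for NONE/ROW before this change
        int afterChange = 1458;   // optimized seeks for all bloom types after it

        double before = savingsPercent(baseline, beforeChange);
        double after = savingsPercent(baseline, afterChange);

        System.out.printf("savings before: %.2f%%%n", before);
        System.out.printf("savings after:  %.2f%%%n", after);
        // The "about 10%" in the text is the gap between the two savings
        // rates, i.e. a difference in percentage points.
        System.out.printf("difference:     %.2f points%n", after - before);
    }
}
```

The gap equals (1714 - 1458) / 2506 of the baseline seeks, so the "about 10% more" in the evaluation refers to percentage points of the unoptimized seek count.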