Date: Wed, 12 Mar 2014 00:50:45 +0000 (UTC)
From: "Hudson (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-9778) Add hint to ExplicitColumnTracker to avoid seeking

    [ https://issues.apache.org/jira/browse/HBASE-9778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931211#comment-13931211 ]

Hudson commented on HBASE-9778:
-------------------------------

SUCCESS: Integrated in hbase-0.96-hadoop2 #235 (See [https://builds.apache.org/job/hbase-0.96-hadoop2/235/])
HBASE-9778 Add hint to ExplicitColumnTracker to avoid seeking (larsh: rev 1576462)
* /hbase/branches/0.96/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Scan.java
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
* /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java
* /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java
* /hbase/branches/0.96/src/main/docbkx/performance.xml


> Add hint to ExplicitColumnTracker to avoid seeking
> --------------------------------------------------
>
>                 Key: HBASE-9778
>                 URL: https://issues.apache.org/jira/browse/HBASE-9778
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.96.2, 0.98.1, 0.99.0, 0.94.18
>
>         Attachments: 9778-0.94-v2.txt, 9778-0.94-v3.txt, 9778-0.94-v4.txt, 9778-0.94-v5.txt, 9778-0.94-v6.txt, 9778-0.94-v7.txt, 9778-0.94-v8.txt, 9778-0.94-v9.txt, 9778-0.94.txt, 9778-trunk-v2.txt, 9778-trunk-v3.txt, 9778-trunk-v6.txt, 9778-trunk-v7.txt, 9778-trunk-v8.txt, 9778-trunk-v9.txt, 9778-trunk.txt
>
>
> The issue of slow seeking in ExplicitColumnTracker was brought up by [~vrodionov] on the dev list.
> My idea here is to avoid the seeking if we know that there aren't many versions to skip.
> How do we know? We'll use the column family's VERSIONS setting as a hint. If VERSIONS is set to 1 (or maybe some value < 10) we'll avoid the seek and call SKIP repeatedly.
> HBASE-9769 has some initial numbers for this approach:
> Interestingly, the result depends on which column(s) are selected.
> Some numbers: 4m rows, 5 cols each, 1 cf, 10-byte values, VERSIONS=1, everything filtered at the server with a ValueFilter. Everything measured in seconds.
> Without patch:
> ||Wildcard||Col 1||Col 2||Col 4||Col 5||Col 2+4||
> |6.4|8.5|14.3|14.6|11.1|20.3|
> With patch:
> ||Wildcard||Col 1||Col 2||Col 4||Col 5||Col 2+4||
> |6.4|8.4|8.9|9.9|6.4|10.0|
> Variation here was +- 0.2s.
> So with this patch scanning is 2x faster than without in some cases, and never slower. No special hint is needed, beyond declaring VERSIONS correctly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
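
The VERSIONS-based hint described in the issue can be sketched roughly as follows. This is an illustration only, not the committed patch: the class `SeekHintSketch`, the enum `Action`, the method `nextColumnAction`, and the cutoff `ASSUMED_SKIP_THRESHOLD` are all hypothetical names introduced here. The cutoff of 10 is taken from the description's tentative "some value < 10", and the real change may decide differently.

```java
// Hypothetical sketch of the idea in HBASE-9778; not code from the patch.
final class SeekHintSketch {

    /** Stand-in for what the scanner does after the current column is done. */
    enum Action {
        SKIP,          // step over cells one at a time
        SEEK_NEXT_COL  // reposition the scanner at the next column
    }

    // Assumed cutoff: the issue only suggests "1 (or maybe some value < 10)".
    static final int ASSUMED_SKIP_THRESHOLD = 10;

    /**
     * Chooses between skipping and seeking, using the column family's
     * VERSIONS setting as a hint for how many cells may need to be passed.
     */
    static Action nextColumnAction(int maxVersions) {
        if (maxVersions < 1) {
            throw new IllegalArgumentException("VERSIONS must be >= 1");
        }
        // Few versions per column: a handful of cheap SKIPs beats one seek.
        return maxVersions < ASSUMED_SKIP_THRESHOLD
                ? Action.SKIP
                : Action.SEEK_NEXT_COL;
    }

    public static void main(String[] args) {
        for (int versions : new int[] {1, 3, 10, 1000}) {
            System.out.println("VERSIONS=" + versions + " -> "
                    + nextColumnAction(versions));
        }
    }
}
```

Under these assumptions, a family declared with VERSIONS=1, as in the benchmark above, would always take the SKIP path, which is consistent with the note that no hint is needed beyond declaring VERSIONS correctly.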