Return-Path:
X-Original-To: apmail-hbase-issues-archive@www.apache.org
Delivered-To: apmail-hbase-issues-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id DDF221057E for ; Thu, 27 Feb 2014 20:22:26 +0000 (UTC)
Received: (qmail 21859 invoked by uid 500); 27 Feb 2014 20:22:22 -0000
Delivered-To: apmail-hbase-issues-archive@hbase.apache.org
Received: (qmail 21818 invoked by uid 500); 27 Feb 2014 20:22:21 -0000
Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Delivered-To: mailing list issues@hbase.apache.org
Received: (qmail 21804 invoked by uid 99); 27 Feb 2014 20:22:21 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 27 Feb 2014 20:22:21 +0000
Date: Thu, 27 Feb 2014 20:22:21 +0000 (UTC)
From: "Lars Hofhansl (JIRA)"
To: issues@hbase.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Commented] (HBASE-10625) Remove unnecessary key compare from AbstractScannerV2.reseekTo
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/HBASE-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13915006#comment-13915006 ]

Lars Hofhansl commented on HBASE-10625:
---------------------------------------

bq. Test with adjacent cols like 4th and 5th?

Did that, it is still faster, but to a lesser degree.

(In the end, the best solution here would be to only (re)seek when we can expect the KV we're looking for to be in another HFile block, otherwise do a series of next calls. I have not found a clean way, yet, of doing that.)
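To make the parenthetical idea above concrete, here is a minimal sketch of the decision it describes. It is not code from HBase and not from the attached patches: the class and method names are hypothetical, keys are compared as plain unsigned byte arrays rather than with HBase's KeyValue comparators, and it assumes the caller passes forward keys as the reseek contract requires.

```java
import java.util.Arrays;

/** Hypothetical illustration only; none of these names are HBase APIs. */
final class ReseekSketch {

    private ReseekSketch() {
    }

    /**
     * Decides whether a forward reseek can be served from the block the
     * scanner is already in (by a series of next() calls) or whether the
     * target should be looked up in another block.
     *
     * @param nextIndexedKey first key of the following block, or null when
     *                       the current block is the last one (assumption)
     * @param targetKey      the key being sought, assumed to sort at or after
     *                       the scanner's current position
     * @return true if the target sorts before the next block's first key
     */
    static boolean staysInCurrentBlock(byte[] nextIndexedKey, byte[] targetKey) {
        if (nextIndexedKey == null) {
            // No following block: stepping forward is the only option.
            return true;
        }
        // Lexicographic, unsigned byte-wise comparison (Java 9+).
        return Arrays.compareUnsigned(targetKey, nextIndexedKey) < 0;
    }
}
```

A caller would step with next calls while `staysInCurrentBlock` returns true and fall back to a block seek otherwise. Whether that is actually cheaper depends on how many KVs have to be skipped inside the block, which the sketch does not model.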
> Remove unnecessary key compare from AbstractScannerV2.reseekTo
> --------------------------------------------------------------
>
>                 Key: HBASE-10625
>                 URL: https://issues.apache.org/jira/browse/HBASE-10625
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>         Attachments: 10625-0.94.txt, 10625-trunk.txt
>
>
> In reseekTo we find this:
> {code}
> ...
>       compared = compareKey(reader.getComparator(), key, offset, length);
>       if (compared < 1) {
>         // If the required key is less than or equal to current key, then
>         // don't do anything.
>         return compared;
>       } else {
>         ...
>         return loadBlockAndSeekToKey(this.block, this.nextIndexedKey,
>             false, key, offset, length, false);
> ...
> {code}
> loadBlockAndSeekToKey already does the right thing when we pass a key that sorts before the current key. It's less efficient than this early check, but in the vast majority of (all?) cases we pass forward keys (as required by the reseek contract). We're optimizing the wrong thing.
> With the check removed, scanning with the ExplicitColumnTracker is 20-30% faster.
> (I tested with rows of 5 short KVs, selecting the 2nd and/or 4th column.)
> I propose simply removing that check.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)