Return-Path:
X-Original-To: apmail-hbase-issues-archive@www.apache.org
Delivered-To: apmail-hbase-issues-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by minotaur.apache.org (Postfix) with SMTP id EB30BD7F6
    for ; Sun, 17 Mar 2013 02:36:14 +0000 (UTC)
Received: (qmail 78689 invoked by uid 500); 17 Mar 2013 02:36:13 -0000
Delivered-To: apmail-hbase-issues-archive@hbase.apache.org
Received: (qmail 78561 invoked by uid 500); 17 Mar 2013 02:36:13 -0000
Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Delivered-To: mailing list issues@hbase.apache.org
Received: (qmail 78530 invoked by uid 99); 17 Mar 2013 02:36:12 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28)
    by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 17 Mar 2013 02:36:12 +0000
Date: Sun, 17 Mar 2013 02:36:12 +0000 (UTC)
From: "Lars Hofhansl (JIRA)"
To: issues@hbase.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Commented] (HBASE-8001) Avoid unnecessary lazy seek
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/HBASE-8001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604484#comment-13604484 ]

Lars Hofhansl commented on HBASE-8001:
--------------------------------------

I would really like to understand what specific scenario this is optimizing.
Since you're doing an M/R job, maybe this is improving only with multiple threads?
In a single threaded env with a single RS and DN only, I cannot discern any improvement from this, so it must be somewhere else.

> Avoid unnecessary lazy seek
> ---------------------------
>
>                 Key: HBASE-8001
>                 URL: https://issues.apache.org/jira/browse/HBASE-8001
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 0.94.5
>            Reporter: Raymond Liu
>            Assignee: Raymond Liu
>             Fix For: 0.98.0
>
>         Attachments: HBASE-8001_onescanner.patch, HBASE-8001_onescanner_v2.patch
>
>
> Lazy seek helps to reduce the number of real seeks needed across multiple HFiles when the KVs from a newer HFile are enough to satisfy the query.
> In many cases, however, it just pushes the real seek later and does not reduce the number of real seeks, e.g. when there is only one HFile, or the other StoreFileScanners are closed and only one is left, or the scan needs to go through all the versions, or there is only one version of each row and a sequential scan is performed. In these cases, lazy seek just brings extra overhead.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
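
To make the trade-off in the issue description concrete, here is a toy model of lazy seek, counting real seeks with and without it. It is only a sketch under stated assumptions and is not HBase's implementation: FileScanner, lazy_seek, enforce_seek and newest_cell are hypothetical names, cells are reduced to (row, timestamp) pairs, and the "fake key at the file's max timestamp" rule is a simplification of the idea described in the issue.

```python
import bisect

LATEST = float("inf")  # "newest possible" timestamp used in seek keys


class FileScanner:
    """Toy scanner over one sorted file of (row, timestamp) cells (hypothetical)."""

    def __init__(self, cells):
        # Order by row ascending, then timestamp descending (newest first).
        self.keys = sorted((row, -ts) for row, ts in cells)
        self.max_ts = max(ts for _, ts in cells)
        self.pos = 0
        self.pending = None    # (row, ts) of a deferred seek, if any
        self.fake_key = None   # ordering key reported while the seek is deferred
        self.real_seeks = 0

    def seek(self, row, ts):
        """Real seek: position on the first cell at or after (row, ts)."""
        self.real_seeks += 1
        self.pos = bisect.bisect_left(self.keys, (row, -ts))
        self.pending = self.fake_key = None

    def lazy_seek(self, row, ts):
        """Defer the seek if this file cannot hold a cell newer than max_ts."""
        if ts > self.max_ts:
            self.pending = (row, ts)
            self.fake_key = (row, -self.max_ts)  # earliest key the file could hold
        else:
            self.seek(row, ts)

    def enforce_seek(self):
        """Perform the deferred seek for real."""
        self.seek(*self.pending)

    def peek(self):
        if self.fake_key is not None:
            return self.fake_key
        return self.keys[self.pos] if self.pos < len(self.keys) else None


def newest_cell(scanners, row, lazy):
    """Return (newest cell of `row` or None, total number of real seeks)."""
    for s in scanners:
        (s.lazy_seek if lazy else s.seek)(row, LATEST)
    while True:
        live = [s for s in scanners if s.peek() is not None]
        if not live:
            return None, sum(s.real_seeks for s in scanners)
        top = min(live, key=FileScanner.peek)
        if top.pending is not None:
            # A deferred scanner reached the top, so its real seek is now due.
            top.enforce_seek()
            continue
        found_row, neg_ts = top.peek()
        cell = (found_row, -neg_ts) if found_row == row else None
        return cell, sum(s.real_seeks for s in scanners)


if __name__ == "__main__":
    newer = [("r1", 20), ("r2", 21)]
    older = [("r1", 10), ("r2", 11)]
    for lazy in (False, True):
        two = newest_cell([FileScanner(newer), FileScanner(older)], "r1", lazy)
        one = newest_cell([FileScanner(newer)], "r1", lazy)
        print("lazy" if lazy else "eager", "two files:", two, "one file:", one)
```

In this model, with two files the lazy path answers the query with one real seek instead of two, because the older file's deferred seek never reaches the top. With a single file both paths perform exactly one real seek, so the lazy path only adds bookkeeping, which is the case the issue is about.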