Date: Wed, 18 Oct 2017 09:38:01 +0000 (UTC)
From: "Guanghao Zhang (JIRA)"
To: dev@hbase.apache.org
Subject: [jira] [Created] (HBASE-19034) Implement "optimize SEEK to SKIP" in storefile scanner

Guanghao Zhang created HBASE-19034:
--------------------------------------

             Summary: Implement "optimize SEEK to SKIP" in storefile scanner
                 Key: HBASE-19034
                 URL: https://issues.apache.org/jira/browse/HBASE-19034
             Project: HBase
          Issue Type: Bug
            Reporter: Guanghao Zhang


{code}
  protected boolean trySkipToNextRow(Cell cell) throws IOException {
    Cell nextCell = null;
    do {
      // Re-read and re-compare the indexed key on every iteration.
      Cell nextIndexedKey = getNextIndexedKey();
      if (nextIndexedKey != null && nextIndexedKey != KeyValueScanner.NO_NEXT_INDEXED_KEY
          && matcher.compareKeyForNextRow(nextIndexedKey, cell) >= 0) {
        // The next row is reachable without leaving the loaded block: skip one cell.
        this.heap.next();
        ++kvsScanned;
      } else {
        // Skipping is not known to be cheap; the caller falls back to a real seek.
        return false;
      }
    } while ((nextCell = this.heap.peek()) != null && CellUtil.matchingRows(cell, nextCell));
    return true;
  }
{code}

When the ScanQueryMatcher (SQM) returns SEEK_NEXT_ROW, the store scanner seeks to the first cell of the next row. HBASE-13109 optimized this SEEK into a SKIP for the case where the target cell can be read from the currently loaded block: the scanner skips forward by calling heap.next() until it reaches a cell from the next row. The problem is that the while loop compares against nextIndexedKey on every iteration. We plan to move the comparison outside the loop to reduce the number of comparisons.
One problem is that nextIndexedKey may change when heap.peek() is called, because the current storefile scanner may have changed. So my proposal is to move the "optimize SEEK to SKIP" logic into the storefile scanner. When seek is called on a storefile scanner, it may either perform a real seek or implement the seek as several skips.

Any suggestions are welcome. Thanks.
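
To make the proposal concrete, here is a minimal sketch of "seek implemented as several skips" inside a single-file scanner. It is a toy model, not HBase code: ToyFileScanner, its fields, and the fixed-size string "blocks" are all hypothetical stand-ins for a storefile scanner and its block index. The point it illustrates is that within one file the next indexed key cannot change while skipping inside the current block, so a single comparison before the loop is enough.

```java
/**
 * Toy model of a scanner over ONE sorted file split into fixed-size blocks.
 * Illustrative only: none of these names are HBase APIs.
 */
final class ToyFileScanner {
    private final String[] rows;   // row key of each cell, sorted; repeats mean same row
    private final int blockSize;   // number of cells per block
    private int pos = 0;           // index of the current cell
    int skips = 0;                 // cells stepped over one at a time
    int seeks = 0;                 // real (binary-search) seeks performed

    ToyFileScanner(String[] sortedRows, int blockSize) {
        this.rows = sortedRows.clone();
        this.blockSize = blockSize;
    }

    /** Row key of the current cell, or null when the file is exhausted. */
    String peek() {
        return pos < rows.length ? rows[pos] : null;
    }

    /** First row key of the block after the current one, or null if there is none. */
    private String nextIndexedKey() {
        int nextBlockStart = (pos / blockSize + 1) * blockSize;
        return nextBlockStart < rows.length ? rows[nextBlockStart] : null;
    }

    /** Moves to the first cell of the next row, skipping when that is known to be cheap. */
    void seekToNextRow() {
        String current = peek();
        if (current == null) {
            return;
        }
        String indexed = nextIndexedKey();
        // Single comparison, done once before the loop: if the next block starts
        // on a later row, the current row ends inside the current block.
        if (indexed != null && indexed.compareTo(current) > 0) {
            while (pos < rows.length && rows[pos].equals(current)) {
                pos++;
                skips++;
            }
        } else {
            // The current row may continue into the next block: do a real seek.
            pos = firstIndexAfter(current);
            seeks++;
        }
    }

    /** Binary search for the first cell whose row key is greater than the given row. */
    private int firstIndexAfter(String row) {
        int lo = pos;
        int hi = rows.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (rows[mid].compareTo(row) <= 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }
}
```

With rows a, a, b, b, b, b, b, c and a block size of 4, the first seekToNextRow() call reaches "b" with two skips and no seek, because the next block starts on a later row. The second call finds that row "b" continues into the next block and falls back to a real seek to "c". Whether the real storefile scanner can expose its index this cheaply is an assumption of the sketch.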