Date: Mon, 20 Feb 2017 08:12:44 +0000 (UTC)
From: "Eshcar Hillel (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-17655) Removing MemStoreScanner and SnapshotScanner

    [ https://issues.apache.org/jira/browse/HBASE-17655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15874183#comment-15874183 ]

Eshcar Hillel commented on HBASE-17655:
---------------------------------------

Thanks, [~ram_krish]. While working on HBASE-17339 it occurred to me that it is very difficult to debug the scanners when there are layers upon layers of scanners and key-value heaps. Some simplification is required here to keep the code maintainable.

> Removing MemStoreScanner and SnapshotScanner
> --------------------------------------------
>
>                 Key: HBASE-17655
>                 URL: https://issues.apache.org/jira/browse/HBASE-17655
>             Project: HBase
>          Issue Type: Improvement
>          Components: Scanners
>   Affects Versions: 2.0.0
>            Reporter: Eshcar Hillel
>            Assignee: Eshcar Hillel
>         Attachments: HBASE-17655-V01.patch
>
>
> With CompactingMemstore becoming the new default, a store comprises multiple memory segments and not just 1-2. MemStoreScanner encapsulates the scanning of segments in the memory part of the store. SnapshotScanner is used to scan the snapshot segment upon flush to disk.
> Having the scanner logic scattered across multiple classes (StoreScanner, SegmentScanner, MemStoreScanner, SnapshotScanner) makes maintenance and debugging challenging tasks, not always for a good reason.
> For example, MemStoreScanner has a KeyValueHeap (KVH). The store scanner also has a KVH, so creating it results in a KVH inside a KVH. Reasoning about the correctness of the methods supported by the scanner (seek, next, hasNext, peek, etc.) is hard, and debugging them is cumbersome.
> In addition, by removing the MemStoreScanner layer we allow the store scanner to filter out each memory scanner individually, instead of either taking them all (in most cases) or discarding them all (rarely).
> SnapshotScanner is a simplified version of SegmentScanner, as it is used only in a specific context. However, it is an additional implementation of the same logic with no real performance advantage.
> Therefore, I suggest removing both MemStoreScanner and SnapshotScanner. The code is adjusted to handle the list of segment scanners they encapsulate.
> This fits well with the current code, since in most cases a list of scanners is expected at some point, so passing the actual list of segment scanners is more natural than wrapping a single (high-level) scanner with Collections.singletonList(...).
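
For illustration only, here is a minimal sketch of the flattened design described above: a list of per-segment scanners is merged by a single heap, and each scanner can be kept or skipped on its own. This is not HBase code. FlatScannerSketch, SegmentCursor and scan are hypothetical names, keys are plain strings rather than cells, and java.util.PriorityQueue stands in for KeyValueHeap.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.Predicate;

// Hypothetical sketch, not HBase code: every name here is illustrative.
public class FlatScannerSketch {

    /** Stand-in for a segment scanner: a cursor over one segment's sorted keys. */
    static final class SegmentCursor {
        final String name;
        private final Iterator<String> it;
        private String current;

        SegmentCursor(String name, List<String> sortedKeys) {
            this.name = name;
            this.it = sortedKeys.iterator();
            this.current = it.hasNext() ? it.next() : null;
        }

        /** Returns the key the cursor points at, or null when exhausted. */
        String peek() {
            return current;
        }

        /** Returns the current key and advances the cursor. */
        String next() {
            String result = current;
            current = it.hasNext() ? it.next() : null;
            return result;
        }
    }

    /** Merges the selected cursors with one heap; no heap is nested in another. */
    static List<String> scan(List<SegmentCursor> cursors, Predicate<SegmentCursor> keep) {
        PriorityQueue<SegmentCursor> heap =
            new PriorityQueue<>((a, b) -> a.peek().compareTo(b.peek()));
        for (SegmentCursor c : cursors) {
            // Each cursor is accepted or rejected individually, not all-or-nothing.
            if (keep.test(c) && c.peek() != null) {
                heap.add(c);
            }
        }
        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            SegmentCursor top = heap.poll();
            out.add(top.next());
            if (top.peek() != null) {
                heap.add(top); // re-insert so the heap orders it by its new head key
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<SegmentCursor> cursors = List.of(
            new SegmentCursor("active", List.of("b", "e")),
            new SegmentCursor("pipeline-1", List.of("a", "d")),
            new SegmentCursor("snapshot", List.of("c", "f")));
        // Skip one memory segment while keeping the others.
        System.out.println(scan(cursors, c -> !c.name.equals("snapshot"))); // prints [a, b, d, e]
    }
}
```

Because the caller already holds a list of cursors, nothing needs to be wrapped in Collections.singletonList(...), and seek/next/peek behaviour only has to be reasoned about at one heap level.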