Date: Fri, 6 Sep 2013 20:47:52 +0000 (UTC)
From: "Matt Corgan (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-9440) Pass blocks of KVs from HFile scanner to the StoreFileScanner and up

[ https://issues.apache.org/jira/browse/HBASE-9440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760600#comment-13760600 ]

Matt Corgan commented on HBASE-9440:
------------------------------------

It's a somewhat advanced optimization, but I've always hoped to see block-level transfer of data like this, both for compactions and for long scans. For compactions, it's probably quite common that all the cells in a block will remain contiguous, in which case you could skip the decompression, decoding, heap logic, encoding, and compression steps and just hand the byte[] through to the new file.
For the client case, maybe make it a setting to bring whole blocks back to the client (as soon as any part of a block is needed) and do the filtering logic client-side.

> Pass blocks of KVs from HFile scanner to the StoreFileScanner and up
> --------------------------------------------------------------------
>
>                 Key: HBASE-9440
>                 URL: https://issues.apache.org/jira/browse/HBASE-9440
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>
> Currently we read KVs from an HFileScanner one by one and pass them up the scanner/heap tree. Many times the ranges of KVs retrieved from StoreFileScanner (by StoreScanners) and HFileScanner (by StoreFileScanner) will be non-overlapping. If chunks of KVs do not overlap, we can sort entire chunks just by comparing the start/end keys of the chunks. Only if chunks overlap do we need to sort KV by KV as we do now.
> I have no patch, but I wanted to float this idea.
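
To make the chunk-ordering idea from the issue description concrete, here is a minimal sketch in Java. It is not HBase code and not a proposed patch: the class name ChunkMerge and the use of plain String keys in place of KeyValues are assumptions made purely for illustration. It only shows the shape of the fast path (compare the end key of one chunk with the start key of the other) and the KV-by-KV fallback for overlapping chunks.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: chunks are sorted lists of String keys,
// standing in for sorted runs of KVs handed up by a lower-level scanner.
final class ChunkMerge {

    // Merges two individually sorted chunks into one sorted list.
    static List<String> merge(List<String> a, List<String> b) {
        List<String> out = new ArrayList<>(a.size() + b.size());
        if (a.isEmpty() || b.isEmpty()) {
            out.addAll(a);
            out.addAll(b);
            return out;
        }
        // Fast path: the ranges do not overlap, so one comparison of
        // boundary keys orders the two chunks as whole units.
        if (a.get(a.size() - 1).compareTo(b.get(0)) < 0) {
            out.addAll(a);
            out.addAll(b);
            return out;
        }
        if (b.get(b.size() - 1).compareTo(a.get(0)) < 0) {
            out.addAll(b);
            out.addAll(a);
            return out;
        }
        // Slow path: the ranges overlap, so fall back to key-by-key merging.
        int i = 0;
        int j = 0;
        while (i < a.size() && j < b.size()) {
            if (a.get(i).compareTo(b.get(j)) <= 0) {
                out.add(a.get(i++));
            } else {
                out.add(b.get(j++));
            }
        }
        while (i < a.size()) {
            out.add(a.get(i++));
        }
        while (j < b.size()) {
            out.add(b.get(j++));
        }
        return out;
    }
}
```

In this sketch the fast path costs one key comparison per pair of chunks instead of one per key, which is the saving the description is after. How chunks would actually be delimited and passed between HFileScanner, StoreFileScanner, and StoreScanner is left open here, as it is in the issue.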