Date: Thu, 13 Apr 2017 08:06:41 +0000 (UTC)
From: "Duo Zhang (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-17910) Use separated StoreFileReader for streaming read

    [ https://issues.apache.org/jira/browse/HBASE-17910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967268#comment-15967268 ]

Duo Zhang commented on HBASE-17910:
-----------------------------------

I do not think this modification will have a big impact on the number of active file handles. In most cases we will still use pread, I think. Another problem is the number of connections to the datanode; that is a problem for the DFS client, and we would need to introduce a new multiplexed protocol to communicate with the datanode.

> Use separated StoreFileReader for streaming read
> ------------------------------------------------
>
>                 Key: HBASE-17910
>                 URL: https://issues.apache.org/jira/browse/HBASE-17910
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Duo Zhang
>
> We already support using private readers for compaction, by creating a new StoreFile copy. I think a better way is to allow creating multiple readers from a single StoreFile instance. That way we can avoid the ugly cloning, and the reader can also be used for streaming scans, not only for compaction.
> The reason we want to do this is that we found read amplification when using short-circuit read. {{BlockReaderLocal}} reads data through an internal buffer first; the buffer size is based on the configured buffer size and the readahead option in CachingStrategy. For a normal pread request we should just bypass the buffer, which can be achieved by setting readahead to 0. But for streaming read I think the buffer is still useful. So we need to use different FSDataInputStream instances for pread and streaming read.
> And one more thing: we can also remove the streamLock if streaming read always uses its own reader.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
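
A rough sketch of the idea in the quoted description, using plain java.nio rather than HBase or HDFS classes: one file object hands out a shared positional (pread) reader with no read-ahead buffer, and a private buffered reader for each streaming scan. The names SketchStoreFile, PreadReader and StreamReader are hypothetical and are not part of the HBase API; the actual change would operate on FSDataInputStream, which is not modeled here.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical stand-in for a store file that hands out readers; not HBase's StoreFile. */
final class SketchStoreFile {
    private final Path path;

    SketchStoreFile(Path path) {
        this.path = path;
    }

    /** Positional reader: no read-ahead buffer, so a small read touches only the bytes asked for. */
    PreadReader openPreadReader() throws IOException {
        return new PreadReader(FileChannel.open(path, StandardOpenOption.READ));
    }

    /** Streaming reader: a private handle with its own buffer, one per scan or compaction. */
    StreamReader openStreamReader(int bufferSize) throws IOException {
        return new StreamReader(
            new DataInputStream(new BufferedInputStream(Files.newInputStream(path), bufferSize)));
    }

    static final class PreadReader implements AutoCloseable {
        private final FileChannel channel;

        PreadReader(FileChannel channel) {
            this.channel = channel;
        }

        /** Reads exactly len bytes at offset; the channel's own position is never moved. */
        byte[] read(long offset, int len) throws IOException {
            ByteBuffer buf = ByteBuffer.allocate(len);
            while (buf.hasRemaining()) {
                int n = channel.read(buf, offset + buf.position());
                if (n < 0) {
                    throw new EOFException("read past end of file");
                }
            }
            return buf.array();
        }

        @Override
        public void close() throws IOException {
            channel.close();
        }
    }

    static final class StreamReader implements AutoCloseable {
        private final DataInputStream in;

        StreamReader(DataInputStream in) {
            this.in = in;
        }

        /** Reads the next len bytes sequentially; the private buffer serves most small reads. */
        byte[] next(int len) throws IOException {
            byte[] out = new byte[len];
            in.readFully(out);
            return out;
        }

        @Override
        public void close() throws IOException {
            in.close();
        }
    }
}
```

Because each StreamReader owns its handle and position, two scans never contend for a shared stream position, which is the property that would make a lock such as streamLock unnecessary in this simplified model.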