Date: Tue, 20 Nov 2012 05:16:04 +0000 (UTC)
From: "Lars Hofhansl (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Updated] (HBASE-7180) RegionScannerImpl.next() is inefficient.


     [ https://issues.apache.org/jira/browse/HBASE-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-7180:
---------------------------------

    Attachment: 7180-0.94-v1.txt

Slightly better patch. Also changes the call in RegionServer to use the cheaper version of next().
Looking around at the code, we can also replace all the calls from the AggregationImplementation to use this cheaper next() method.

> RegionScannerImpl.next() is inefficient.
> ----------------------------------------
>
>                 Key: HBASE-7180
>                 URL: https://issues.apache.org/jira/browse/HBASE-7180
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>         Attachments: 7180-0.94-SKETCH.txt, 7180-0.94-v1.txt
>
>
> We just came across a special scenario.
> For our Phoenix project (SQL runtime for HBase), we push a lot of work into HBase via coprocessors. One method is to wrap RegionScanner in coprocessor hooks and then do processing in the hook to avoid returning a lot of data to the client unnecessarily.
> In this specific case this is pretty bad. Since the wrapped RegionScanner's next() does not "know" that it is called this way, it still does all of this on each invocation:
> # Starts a RegionOperation
> # Increments the request count
> # Sets the current read point on a thread local (because generally each call could come from a different thread)
> # Finally does the next on its StoreScanner(s)
> # Ends the RegionOperation
> When this is done in a tight loop millions of times (as is the case for us) it starts to become significant.
> Not sure what to do about this, really. Opening this issue for discussion.
> One way is to extend the RegionScanner with an "internal" next() method of sorts, so that all this overhead can be avoided. The coprocessor could call the regular next() methods once and then just call the cheaper internal version.
> Are there better/cleaner ways?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
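
For readers of the archive, here is a minimal sketch of the idea in the issue description: a regular next() that pays the per-call bookkeeping, and a cheaper "internal" variant that skips it. It is not the attached patch and not HBase code. SketchScanner, SketchCoprocessor, nextInternal(), and all field names are hypothetical, and the bookkeeping steps are reduced to counters so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a region scanner. None of these names are HBase APIs.
class SketchScanner {
    private final int totalRows;
    private int position = 0;

    long requestCount = 0;       // bumped once per regular next() call
    long operationsStarted = 0;  // stands in for starting a RegionOperation
    long operationsEnded = 0;    // stands in for ending a RegionOperation
    private final ThreadLocal<Long> readPoint = new ThreadLocal<>();

    SketchScanner(int totalRows) {
        this.totalRows = totalRows;
    }

    // Regular next(): performs all five steps listed in the issue on every call.
    boolean next(List<String> out) {
        operationsStarted++;                 // 1. start the operation
        try {
            requestCount++;                  // 2. increment the request count
            readPoint.set(System.nanoTime()); // 3. set the read point on a thread local
            return nextInternal(out);        // 4. do the actual scan step
        } finally {
            operationsEnded++;               // 5. end the operation
        }
    }

    // Cheaper variant (assumed name): only advances the scan, no bookkeeping.
    // Returns true if more rows remain after this call.
    boolean nextInternal(List<String> out) {
        if (position >= totalRows) {
            return false;
        }
        out.add("row-" + position);
        position++;
        return position < totalRows;
    }
}

// Hypothetical coprocessor-style caller that drains the scanner in a tight loop.
class SketchCoprocessor {
    static int countRows(SketchScanner scanner) {
        List<String> rows = new ArrayList<>();
        // Call the regular next() once, as the issue suggests...
        boolean more = scanner.next(rows);
        // ...then use the cheaper internal version for the rest of the loop.
        while (more) {
            more = scanner.nextInternal(rows);
        }
        return rows.size();
    }
}
```

In this sketch the bookkeeping cost is paid once per scan rather than once per row. Whether skipping the per-call read point and operation tracking is safe in a real region server depends on the threading guarantees of the caller, which the sketch does not model.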