X-Original-To: apmail-hbase-issues-archive@www.apache.org
Delivered-To: apmail-hbase-issues-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by minotaur.apache.org (Postfix) with SMTP id 9EF16E076
    for ; Thu, 28 Feb 2013 05:05:15 +0000 (UTC)
Received: (qmail 53363 invoked by uid 500); 28 Feb 2013 05:05:15 -0000
Delivered-To: apmail-hbase-issues-archive@hbase.apache.org
Received: (qmail 53274 invoked by uid 500); 28 Feb 2013 05:05:14 -0000
Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Delivered-To: mailing list issues@hbase.apache.org
Received: (qmail 53188 invoked by uid 99); 28 Feb 2013 05:05:13 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28)
    by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 28 Feb 2013 05:05:13 +0000
Date: Thu, 28 Feb 2013 05:05:13 +0000 (UTC)
From: "stack (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-7952) Remove update() and Improve ExplicitColumnTracker performance.
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394


    [ https://issues.apache.org/jira/browse/HBASE-7952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589188#comment-13589188 ]

stack commented on HBASE-7952:
------------------------------

[~colorant] Thank you for digging into the history. The 'early-out' was for get-of-most-recent version; it was thought that just looking in the first storefile would in many cases be optimal if only one version is required. I've not looked at the code in a while. This form of 'early-out' may indeed have been dropped.

So, should you add a big 'PRESUMES COLUMNS ARE COMING IN ORDER' note at the top of the tracker?
And, can you think of a unit test where we could have the columns come out of order given some layout in the storefiles?

Looks like you need to edit more of the javadoc. The javadoc still says 'two methods' though you are now removing one. Hate to be a stickler on this, but this stuff is complicated enough, and an out-of-place javadoc could throw off newcomers.

Is this a good way to do done() when, say, a row does NOT have all the columns explicitly asked for?

   public boolean done() {
-    return this.columns.size() == 0;
+    return this.index >= this.columns.size();
   }

This looks like a good change:

-    if(this.columns.size() == 0) {
+    if(done()) {

(What was there before didn't help.)

This change looks good too...

-        // Note: because we are done with this column, and are removing
-        // it from columns, we don't do a ++this.index. The index stays
-        // the same but the columns have shifted within the array such
-        // that index now points to the next column we are interested in.
-        this.columns.remove(this.index);
-
+        ++this.index;

What was there previously defied the way folks normally think about these things.

Sorry, I don't know this code well enough. What you are doing looks good. Do you know if there is good test coverage for this stuff?

> Remove update() and Improve ExplicitColumnTracker performance.
> --------------------------------------------------------------
>
>                 Key: HBASE-7952
>                 URL: https://issues.apache.org/jira/browse/HBASE-7952
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 0.94.1, 0.94.5
>            Reporter: Raymond Liu
>            Assignee: Raymond Liu
>             Fix For: 0.96.0
>
>         Attachments: HBASE_7952.patch
>
>
> In ColumnTracker.java, the update() method is no longer used by anyone. No caller invokes checkColumn() for different HFiles with update() in between files to re-walk the target columns. All columns will be fed to checkColumn() in order.
> So, within ExplicitColumnTracker, the target columns need not be kept as a dynamically changing list of columns yet to match. Instead, just moving an index through the list is enough.
> This optimization saves the time spent reconstructing the columns array for each row, so the checkColumn() method's performance could be improved by 10-20%.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
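
For readers following the thread, here is a minimal sketch of the index-based idea discussed above. It is not HBase's actual ExplicitColumnTracker. The class name IndexTrackerSketch, the MatchCode enum, the use of String qualifiers instead of byte arrays, and the one-version-per-column simplification are all assumptions made for illustration. It presumes, as the issue description does, that qualifiers arrive at checkColumn() in sorted order.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-in for an explicit column tracker.
// PRESUMES COLUMNS ARE COMING IN ORDER.
class IndexTrackerSketch {
    // Illustrative result codes, not HBase's real MatchCode values.
    enum MatchCode { INCLUDE, SKIP, DONE }

    private final List<String> columns; // wanted qualifiers, sorted, never mutated
    private int index = 0;              // position of the next wanted qualifier

    IndexTrackerSketch(String... sortedColumns) {
        this.columns = Arrays.asList(sortedColumns);
    }

    // Done once the index has walked past the last wanted column.
    boolean done() {
        return this.index >= this.columns.size();
    }

    MatchCode checkColumn(String qualifier) {
        while (!done()) {
            int cmp = qualifier.compareTo(this.columns.get(this.index));
            if (cmp == 0) {
                ++this.index; // one version per column assumed: move on
                return MatchCode.INCLUDE;
            }
            if (cmp < 0) {
                return MatchCode.SKIP; // input is before the next wanted column
            }
            // Input sorts after the wanted column, so this row lacks it.
            ++this.index;
        }
        return MatchCode.DONE;
    }

    // Per-row reset is a single assignment; no list is rebuilt.
    void reset() {
        this.index = 0;
    }

    public static void main(String[] args) {
        IndexTrackerSketch tracker = new IndexTrackerSketch("b", "d");
        for (String q : new String[] {"a", "b", "c", "e"}) {
            System.out.println(q + " -> " + tracker.checkColumn(q));
        }
        System.out.println("done: " + tracker.done());
    }
}
```

In this sketch a row that lacks a wanted column (here "d") still reaches done(), because the index advances past that column as soon as a later qualifier is seen. Whether the real tracker behaves the same way in that case is the question raised in the comment above, and this sketch does not settle it.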