hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7952) Remove update() and Improve ExplicitColumnTracker performance.
Date Thu, 28 Feb 2013 05:05:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589188#comment-13589188

stack commented on HBASE-7952:

[~colorant] Thank you for digging into the history.  The 'early-out' was for get-of-most-recent
version; was thought that just looking in the first storeflie would in many cases be optimal
is only one version required.  I've not looked at the code in a while.  This form of 'early-out'
may indeed have been dropped.

So, should you add a big "PRESUMES COLUMNS ARE COMING IN ORDER' note at the top of the tracker?
  And, can you think of a unit test that where we could have the columns come out of order
given some layout in the storefiles?

Looks like you need to edit more of the javadoc.  Javadoc says 'two methods' still though
you now are removing one.  Hate to be a stickler on this but this stuff is complicated enough
and an out-of-place javadoc could throw off newcomers.

Is this a good way to do done when say a row does NOT have all the columns explicitly asked

   public boolean done() {
-    return this.columns.size() == 0;
+    return this.index >= this.columns.size();

This looks like a good change;

-      if(this.columns.size() == 0) {
+      if(done()) {

(What was there before didn't help)

This change looks good too...

-          // Note: because we are done with this column, and are removing
-          // it from columns, we don't do a ++this.index. The index stays
-          // the same but the columns have shifted within the array such
-          // that index now points to the next column we are interested in.
-          this.columns.remove(this.index);
+          ++this.index;

What was there previous defied the way folks normally think about these things.

Sorry, I don't know this code well enough.  What you are doing looks good.  Is there good
test coverage for this stuff do you know?

> Remove update() and Improve ExplicitColumnTracker performance.
> --------------------------------------------------------------
>                 Key: HBASE-7952
>                 URL: https://issues.apache.org/jira/browse/HBASE-7952
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 0.94.1, 0.94.5
>            Reporter: Raymond Liu
>            Assignee: Raymond Liu
>             Fix For: 0.96.0
>         Attachments: HBASE_7952.patch
> In ColumnTracker.java, the update() method is not used by anyone now. And no one will
call checkColumn for different HFiles with update() in between files to re-walk through the
target columns. All columns will be feed to checkColumn() in order.
> So, within ExplicitColumnTracker, the target columns can be optimized to not dynamic
maintain a changing list of columns yet to match. Instead, just move index through it is enough.
> with this optimization to save the time for avoid reconstruct a columns array upon each
row, the checkColumn method's performance could be improved by 10-20%.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

View raw message