hbase-issues mailing list archives

From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8316) JoinedHeap for essential column families should reseek instead of seek
Date Wed, 10 Apr 2013 05:24:16 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627494#comment-13627494 ]

Ted Yu commented on HBASE-8316:
-------------------------------

Here is the javadoc for requestSeek():
{code}
   * Similar to {@link #seek} (or {@link #reseek} if forward is true) but only
   * does a seek operation after checking that it is really necessary for the
   * row/column combination specified by the kv parameter. This function was
   * added to avoid unnecessary disk seeks by checking row-column Bloom filters
   * before a seek on multi-column get/scan queries, and to optimize by looking
   * up more recent files first.
{code}
Looks like requestSeek() should perform better.
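
To illustrate the idea in that javadoc, here is a minimal toy sketch, not HBase's actual scanner code. The class LazySeekSketch, its fields, and the use of a HashSet in place of a real row-column Bloom filter are all assumptions made for illustration (a real Bloom filter can also return false positives). It only shows how checking a filter first can avoid paying for a seek that cannot find the key.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Toy model only (hypothetical names); this is not HBase's KeyValueScanner. */
final class LazySeekSketch {
    private final String[] sortedKeys; // stand-in for the sorted keys of one store file
    private final Set<String> filter;  // stand-in for a row-column Bloom filter
    int realSeeks = 0;                 // counts simulated "disk" seeks

    LazySeekSketch(String... keys) {
        sortedKeys = keys.clone();
        Arrays.sort(sortedKeys);
        filter = new HashSet<>(Arrays.asList(sortedKeys));
    }

    /** Unconditional seek: always pays for the lookup; returns index of first key >= target. */
    int seek(String target) {
        realSeeks++;
        int i = Arrays.binarySearch(sortedKeys, target);
        return i >= 0 ? i : -i - 1;
    }

    /** Lazy variant: returns -1 without seeking when the filter rules the key out. */
    int requestSeek(String target) {
        if (!filter.contains(target)) {
            return -1; // no simulated disk seek performed
        }
        return seek(target);
    }

    public static void main(String[] args) {
        LazySeekSketch file = new LazySeekSketch("row1/a", "row2/a", "row4/a");
        file.requestSeek("row3/a"); // filtered out, no seek
        file.requestSeek("row2/a"); // present, one seek
        System.out.println("real seeks: " + file.realSeeks);
    }
}
```

In this toy model only the second request costs a seek; the first is answered by the filter alone.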
                
> JoinedHeap for essential column families should reseek instead of seek
> ----------------------------------------------------------------------
>
>                 Key: HBASE-8316
>                 URL: https://issues.apache.org/jira/browse/HBASE-8316
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Filters, Performance, regionserver
>            Reporter: Lars Hofhansl
>             Fix For: 0.98.0, 0.94.7, 0.95.1
>
>         Attachments: 8316-0.94.txt, 8316-0.96.txt, 8316-trunk.txt
>
>
> This was raised by the Phoenix team. During a profiling session we noticed that catching the joinedHeap up to the current rows via seek causes a performance regression, which makes the joinedHeap only efficient when either a high or low percentage is matched by the filter.
> (High is fine, because the joinedHeap will not get behind as often and does not need to be caught up, low is fine, because the seek isn't happening frequently).
> In our tests we found that the solution is quite simple: Replace seek with reseek. Patch coming soon.
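
As a rough illustration of the seek versus reseek distinction in the description above, the following toy sketch models a scanner over a sorted array. The class ReseekSketch and its counters are hypothetical and are not taken from the attached patches; the sketch assumes that a seek always starts over with an index lookup, while a reseek only moves forward from the current position.

```java
import java.util.Arrays;

/** Toy model only (hypothetical names); not the code from the HBASE-8316 patches. */
final class ReseekSketch {
    private final int[] sortedRows; // distinct row keys in sorted order
    private int pos = 0;            // current scanner position
    int indexLookups = 0;           // counts "start over" index lookups
    int forwardSteps = 0;           // counts rows skipped while moving forward

    ReseekSketch(int... rows) {
        sortedRows = rows.clone();
        Arrays.sort(sortedRows);
    }

    /** Positions at the first row >= target by starting over with an index lookup. */
    int seek(int target) {
        indexLookups++;
        int i = Arrays.binarySearch(sortedRows, target);
        pos = i >= 0 ? i : -i - 1;
        return pos;
    }

    /** Moves forward only from the current position; assumes target is not behind it. */
    int reseek(int target) {
        while (pos < sortedRows.length && sortedRows[pos] < target) {
            pos++;
            forwardSteps++;
        }
        return pos;
    }

    public static void main(String[] args) {
        ReseekSketch scanner = new ReseekSketch(10, 20, 30, 40, 50);
        scanner.seek(20);   // initial positioning needs the index
        scanner.reseek(40); // catching up only walks forward
        System.out.println("index lookups: " + scanner.indexLookups
                + ", forward steps: " + scanner.forwardSteps);
    }
}
```

Both calls end at the same position; in this model the difference is only whether catching up repeats the index lookup, which is cheap to avoid when the target is close to the current row.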
