hbase-issues mailing list archives

From "Raymond Liu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8001) Avoid unnecessary lazy seek
Date Fri, 08 Mar 2013 07:08:15 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596907#comment-13596907 ]

Raymond Liu commented on HBASE-8001:
------------------------------------

If everything is served from the block cache, shouldn't the performance difference between
lazy seek and real seek be even smaller? With no data to load, a real seek only does one
extra block-positioning operation compared to a lazy seek, and that should not cost much time.
In my test comparing a family-only scan against a family+column scan, most of the extra time
is spent creating fake keys and comparing keys between scanners. The lazy seek path needs to
construct two fake keys and do more comparisons, while the real seek path needs only one fake
key and fewer comparisons, which should save around 1/3 of the time. So might there be some
other factor we are missing?
                
> Avoid unnecessary lazy seek
> ---------------------------
>
>                 Key: HBASE-8001
>                 URL: https://issues.apache.org/jira/browse/HBASE-8001
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 0.94.5
>            Reporter: Raymond Liu
>            Assignee: Raymond Liu
>             Fix For: 0.98.0
>
>         Attachments: HBASE-8001_onescanner.patch, HBASE-8001_onescanner_v2.patch
>
>
> Lazy seek helps reduce the number of real seeks needed across multiple HFiles when the KV
> from a newer HFile is enough to satisfy the query.
> In many cases, however, it just defers the real seek and does not reduce the number of
> real seeks, e.g. when there is only one HFile, or the other StoreFileScanners have been
> closed and only one is left, or the scan needs to go through all the versions, or there is
> only one version of each row and a sequential scan is performed. In these cases, lazy seek
> just adds overhead.
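
The trade-off in the issue description above can be illustrated with a small toy model.
This is only a sketch under stated assumptions, not HBase code: ToyFile, countEagerSeeks
and countLazySeeks are hypothetical names, each "file" holds at most one cell per row, and
the query asks for the latest version of a single row. The lazy variant visits files from
newest to oldest and stops once no remaining file can hold a newer cell. The model counts
real seeks only; it does not capture the fake-key construction and key-comparison overhead
discussed in the comment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Toy model only: none of these names are HBase classes or APIs.
final class LazySeekSketch {

    /** Hypothetical stand-in for one HFile: row -> timestamp of the single cell for that row. */
    static final class ToyFile {
        final TreeMap<String, Long> cells = new TreeMap<>();
        long maxTimestamp = Long.MIN_VALUE;

        ToyFile put(String row, long timestamp) {
            cells.put(row, timestamp);
            maxTimestamp = Math.max(maxTimestamp, timestamp);
            return this;
        }

        /** Models a real seek: position in the file and read the cell (null if row is absent). */
        Long realSeek(String row) {
            return cells.get(row);
        }
    }

    /** Eager path: every file is really sought up front, so the count equals the file count. */
    static int countEagerSeeks(List<ToyFile> files, String row) {
        int seeks = 0;
        for (ToyFile file : files) {
            file.realSeek(row);
            seeks++;
        }
        return seeks;
    }

    /**
     * Lazy path: visit files from newest to oldest and skip the real seek on any file whose
     * max timestamp cannot beat the newest cell already found.
     */
    static int countLazySeeks(List<ToyFile> files, String row) {
        List<ToyFile> newestFirst = new ArrayList<>(files);
        newestFirst.sort(Comparator.comparingLong((ToyFile f) -> f.maxTimestamp).reversed());

        int seeks = 0;
        boolean found = false;
        long newestFound = Long.MIN_VALUE;
        for (ToyFile file : newestFirst) {
            if (found && newestFound >= file.maxTimestamp) {
                break; // the remaining (older) files cannot hold a newer version
            }
            Long timestamp = file.realSeek(row);
            seeks++;
            if (timestamp != null && (!found || timestamp > newestFound)) {
                newestFound = timestamp;
                found = true;
            }
        }
        return seeks;
    }

    public static void main(String[] args) {
        List<ToyFile> single = Arrays.asList(new ToyFile().put("r1", 10L));
        List<ToyFile> three = Arrays.asList(
                new ToyFile().put("r1", 10L),
                new ToyFile().put("r1", 20L),
                new ToyFile().put("r1", 30L));

        System.out.println("single file: eager=" + countEagerSeeks(single, "r1")
                + " lazy=" + countLazySeeks(single, "r1"));
        System.out.println("three files: eager=" + countEagerSeeks(three, "r1")
                + " lazy=" + countLazySeeks(three, "r1"));
    }
}
```

In this model, three files with the newest holding the row need one real seek on the lazy
path instead of three, while a single file needs one real seek either way, so in that case
whatever bookkeeping the lazy path adds is pure overhead.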

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
