hbase-issues mailing list archives

From "Eshcar Hillel (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17339) Scan-Memory-First Optimization for Get Operation
Date Sat, 31 Dec 2016 20:51:59 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15790029#comment-15790029 ]

Eshcar Hillel commented on HBASE-17339:
---------------------------------------

Thanks [~enis].
What is the advantage of checking monotonicity after getting the results (step 3 in your
solution) over checking it before opening the memory scanners (step 0 in my solution)?
As I see it, doing the test as the first step has at least three advantages:
1) We avoid opening HFile scanners when they are not needed (when the result from memory is
complete, execution 0-1-2).
2) We avoid running the speculative memory-only scan when it is clear that it will not suffice
(execution 0-5-6).
3) The solution is simpler and only requires adding a maxFlushedTS attribute to the memstore,
which is updated upon a flush.
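
To make the step-0 check concrete, here is a minimal sketch of how such a test could look. It is
not taken from the attached patch: the class MemstoreTimestamps, its methods, and the
minInMemoryTs field are hypothetical names, and it assumes that "monotonic" means every cell
still in memory has a timestamp larger than maxFlushedTS. It also simplifies a flush to
"everything in memory goes to disk".

```java
// Hypothetical sketch; none of these names come from the HBase code base or the patch.
final class MemstoreTimestamps {
    // Largest cell timestamp written to disk by any flush so far.
    private long maxFlushedTs = Long.MIN_VALUE;
    // Smallest cell timestamp currently held in memory.
    private long minInMemoryTs = Long.MAX_VALUE;

    // Record the timestamp of a cell added to the memstore.
    void onWrite(long cellTs) {
        minInMemoryTs = Math.min(minInMemoryTs, cellTs);
    }

    // Update maxFlushedTs upon a flush; maxTsInFlush is the largest flushed timestamp.
    // Simplification: assumes the flush empties the memstore completely.
    void onFlush(long maxTsInFlush) {
        maxFlushedTs = Math.max(maxFlushedTs, maxTsInFlush);
        minInMemoryTs = Long.MAX_VALUE;
    }

    // Step 0: attempt the memory-only scan only if every in-memory cell
    // is newer than everything that has been flushed.
    boolean isMonotonic() {
        return minInMemoryTs > maxFlushedTs;
    }
}
```

With this shape, a get would call isMonotonic() before opening any scanner and fall back to the
full memory-and-disk scan whenever it returns false.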

The test that checks whether the memory-only result is complete before step 3 verifies that,
for each qualifier in the scan, there are as many versions as the scan requires. If the scan
does not define specific qualifiers, the optimization is not invoked, and a full scan is executed.
Do you think there are more conditions to check?
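
For reference, the completeness test described above can be sketched roughly as follows. This is
an illustration only, not the patch code: MemoryResultCheck, isComplete, and the map from
qualifier to the timestamps found in memory are hypothetical, and qualifiers are plain strings
to keep the sketch short.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; names and types are assumptions made for illustration.
final class MemoryResultCheck {
    // Returns true if the memory-only result already satisfies the scan:
    // every requested qualifier has at least maxVersions versions in memory.
    static boolean isComplete(Set<String> requestedQualifiers,
                              int maxVersions,
                              Map<String, List<Long>> versionsFoundInMemory) {
        // No explicit qualifiers: the optimization is not invoked.
        if (requestedQualifiers.isEmpty()) {
            return false;
        }
        for (String qualifier : requestedQualifiers) {
            List<Long> found = versionsFoundInMemory.get(qualifier);
            if (found == null || found.size() < maxVersions) {
                return false; // missing versions may still be on disk
            }
        }
        return true;
    }
}
```

A false result here means the get has to continue with the full scan over memory and disk.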

> Scan-Memory-First Optimization for Get Operation
> ------------------------------------------------
>
>                 Key: HBASE-17339
>                 URL: https://issues.apache.org/jira/browse/HBASE-17339
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Eshcar Hillel
>         Attachments: HBASE-17339-V01.patch
>
>
> The current implementation of a get operation (to retrieve values for a specific key)
> scans through all relevant stores of the region; for each store, both memory components
> (memstore segments) and disk components (HFiles) are scanned in parallel.
> We suggest applying an optimization that speculatively scans the memory-only components
> first and scans both memory and disk only if the result is incomplete.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
