hbase-issues mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17339) Scan-Memory-First Optimization for Get Operation
Date Thu, 29 Dec 2016 02:11:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15784266#comment-15784266
] 

Enis Soztutar commented on HBASE-17339:
---------------------------------------

The main problem is how to determine "ONLY if the result is not complete". When you get the result
of a row from memory, some store file may still contain a newer version (a cell with a higher
timestamp), and you will miss it unless we have a guarantee of monotonically increasing timestamps.

However, we already track min/max timestamps per store file, and we have logic to
eliminate scanners based on those min/max timestamps. We can use this algorithm for correctness:

{code}
 1. open all relevant *memory* scanners
 2. get results
 3. if get returns a result
      check its timestamp against all remaining scanners (KeyValueScanner.shouldUseScanner())
      if all (hfile) scanners have lower timestamps
        return results
 4. otherwise
      open all scanners
      return results
{code}

This will ensure correctness without having to rely on a promise from the user. 
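
To make the check concrete, here is a minimal, self-contained Java sketch of the steps above. It is only an illustration under stated assumptions, not HBase code: `Cell`, `StoreFile`, `Result`, `newest` and `get` are hypothetical stand-ins, a get is reduced to "return the newest version of one cell", and a store file is treated as possibly newer when its max timestamp is greater than or equal to the memory result's timestamp. Delete markers, multi-version gets and sequence-id tie-breaking are not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical, simplified model of the memory-first get; none of these types are HBase classes. */
class MemoryFirstGet {

    /** One versioned value of a single row/column. */
    static final class Cell {
        final long timestamp;
        final String value;
        Cell(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /** Stand-in for a store file: its cells plus the max timestamp tracked for the file. */
    static final class StoreFile {
        final List<Cell> cells;
        final long maxTimestamp;
        StoreFile(List<Cell> cells) {
            this.cells = cells;
            long max = Long.MIN_VALUE;
            for (Cell c : cells) {
                max = Math.max(max, c.timestamp);
            }
            this.maxTimestamp = max;
        }
    }

    /** The cell found (if any) and whether the disk components had to be read. */
    static final class Result {
        final Optional<Cell> cell;
        final boolean readDisk;
        Result(Optional<Cell> cell, boolean readDisk) {
            this.cell = cell;
            this.readDisk = readDisk;
        }
    }

    /** Returns the cell with the highest timestamp, or empty if there are no cells. */
    static Optional<Cell> newest(List<Cell> cells) {
        Cell best = null;
        for (Cell c : cells) {
            if (best == null || c.timestamp > best.timestamp) {
                best = c;
            }
        }
        return Optional.ofNullable(best);
    }

    static Result get(List<Cell> memstore, List<StoreFile> storeFiles) {
        // Steps 1-2: consult only the memory component.
        Optional<Cell> fromMemory = newest(memstore);

        // Step 3: trust the memory result only if every file is strictly older.
        if (fromMemory.isPresent()) {
            boolean allFilesOlder = true;
            for (StoreFile f : storeFiles) {
                if (f.maxTimestamp >= fromMemory.get().timestamp) {
                    allFilesOlder = false; // this file may hold a newer version
                    break;
                }
            }
            if (allFilesOlder) {
                return new Result(fromMemory, false);
            }
        }

        // Step 4: fall back to reading memory and disk together.
        List<Cell> all = new ArrayList<>(memstore);
        for (StoreFile f : storeFiles) {
            all.addAll(f.cells);
        }
        return new Result(newest(all), true);
    }

    public static void main(String[] args) {
        List<Cell> memstore = Arrays.asList(new Cell(10, "from-memory"));
        List<StoreFile> files =
            Arrays.asList(new StoreFile(Arrays.asList(new Cell(20, "from-disk"))));
        Result r = get(memstore, files);
        System.out.println(r.cell.get().value + " readDisk=" + r.readDisk);
    }
}
```

The sketch falls back whenever a file's max timestamp is equal to or higher than the memory result's, which is deliberately conservative: it may read disk when it did not strictly need to, but it never returns a stale version from memory.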

> Scan-Memory-First Optimization for Get Operation
> ------------------------------------------------
>
>                 Key: HBASE-17339
>                 URL: https://issues.apache.org/jira/browse/HBASE-17339
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Eshcar Hillel
>         Attachments: HBASE-17339-V01.patch
>
>
> The current implementation of a get operation (to retrieve values for a specific key)
> scans through all relevant stores of the region; for each store both memory components (memstore
> segments) and disk components (hfiles) are scanned in parallel.
> We suggest applying an optimization that speculatively scans memory-only components first
> and scans both memory and disk only if the result is incomplete.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
