hbase-issues mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17339) Scan-Memory-First Optimization for Get Operation
Date Thu, 29 Dec 2016 02:11:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15784266#comment-15784266 ]

Enis Soztutar commented on HBASE-17339:

The main problem is how to determine "ONLY if result is not complete". When you get the result
of a row from memory, some store file may still contain a higher version (a cell with a newer
timestamp), and you will miss it unless we have the monotonically increasing timestamps guarantee.

However, we already track min and max timestamps per store file, and we have logic to
eliminate scanners based on those min/max timestamps. We can use this algorithm for correctness:

 1. Open all relevant *memory* scanners.
 2. Get results.
 3. If the get returns a result, check its timestamp against all remaining scanners
    (KeyValueScanner.shouldUseScanner()). If all (hfile) scanners have lower timestamps,
    return the results.
 4. Otherwise, open all scanners.
 5. Return results.

This will ensure correctness without having to rely on a promise from the user. 
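
The steps above can be sketched roughly as follows. This is a simplified illustration, not HBase code: the names MemoryFirstGet, Cell, StoreFile, canSkipFiles and get are all hypothetical, a get is reduced to a single cell per row, and the per-file max timestamp stands in for the min/max check that KeyValueScanner.shouldUseScanner() would perform. Timestamp ties are treated conservatively: an equal timestamp in a file forces the full read.

```java
import java.util.List;

/** Sketch only: hypothetical types, not HBase's real scanner classes. */
final class MemoryFirstGet {

    /** Hypothetical cell: a value plus its timestamp. */
    static final class Cell {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /** Hypothetical store file: its tracked max timestamp and the cell it holds for the row (null if none). */
    static final class StoreFile {
        final long maxTimestamp;
        final Cell cell;

        StoreFile(long maxTimestamp, Cell cell) {
            this.maxTimestamp = maxTimestamp;
            this.cell = cell;
        }
    }

    /** Step 3: true only if memory returned a result that is strictly newer than anything a file could hold. */
    static boolean canSkipFiles(Cell memoryResult, List<StoreFile> files) {
        if (memoryResult == null) {
            return false; // nothing found in memory, so the files must be read
        }
        for (StoreFile f : files) {
            if (f.maxTimestamp >= memoryResult.timestamp) {
                return false; // this file may contain a newer (or equally new) version
            }
        }
        return true;
    }

    /** Steps 1-5: try the memory result first, fall back to reading every file. */
    static Cell get(Cell memoryResult, List<StoreFile> files) {
        if (canSkipFiles(memoryResult, files)) {
            return memoryResult; // fast path: no file was opened
        }
        // Slow path: "open all scanners" and keep the newest cell seen.
        Cell newest = memoryResult;
        for (StoreFile f : files) {
            if (f.cell != null && (newest == null || f.cell.timestamp > newest.timestamp)) {
                newest = f.cell;
            }
        }
        return newest;
    }
}
```

With a memory cell at timestamp 100 and a single file whose max timestamp is 50, get() returns the memory cell without touching the file. If the file's max timestamp is 200, the fast path is rejected and the file's newer cell is returned instead.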

> Scan-Memory-First Optimization for Get Operation
> ------------------------------------------------
>                 Key: HBASE-17339
>                 URL: https://issues.apache.org/jira/browse/HBASE-17339
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Eshcar Hillel
>         Attachments: HBASE-17339-V01.patch
> The current implementation of a get operation (to retrieve values for a specific key)
> scans through all relevant stores of the region; for each store both memory components (memstore
> segments) and disk components (hfiles) are scanned in parallel.
> We suggest applying an optimization that speculatively scans memory-only components first
> and only if the result is incomplete scans both memory and disk.
