hbase-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
Date Fri, 27 Dec 2013 18:33:50 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857617#comment-13857617 ]

Sergey Shelukhin commented on HBASE-10241:

bq. On mvcc giving consistent view on region, that is unnecessary, right – when would we
ever care about a consistent view across a region rather than just across a row (other than
the fact that row boundaries are only known after the fact, after you have passed them out)?
It can actually be pretty important: if recovery takes a while and the scanner bounces, the
data it reads can be several minutes apart. For certain use cases it's much better to have
consistent data for close rows (especially if some sharded data is stored). Also, if secondary
reads are implemented, the divergence between scanners can be even greater, so the negative
effects of the scanner "jumping" will be even more visible. Then, as suggested above, by
querying mvcc from all requisite regions before the scanner runs, we can make it even more
reasonable. It then becomes as close as you can get to a consistent view of the data without
implementing something like Percolator, with external timestamps. Which is pretty neat :)
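
To make the "query mvcc from all requisite regions before the scanner runs" idea concrete, here is a minimal sketch. It is not HBase code: the names ReadPointSnapshot, Cell, capture, and scan are hypothetical, and it assumes each region exposes a monotonically increasing mvcc number and that every cell keeps the mvcc number it was written with. The client records one read point per region up front, and any scanner opened or reopened later filters by the recorded value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ReadPointSnapshot {
    /** Hypothetical cell: row key, value, and the mvcc number assigned at write time. */
    static final class Cell {
        final String row;
        final String value;
        final long mvcc;

        Cell(String row, String value, long mvcc) {
            this.row = row;
            this.value = value;
            this.mvcc = mvcc;
        }
    }

    // Region name -> read point captured before the scan started (assumed model).
    private final Map<String, Long> readPoints = new HashMap<>();

    /** Record the region's current mvcc number as this scan's read point. */
    void capture(String region, long currentMvcc) {
        readPoints.put(region, currentMvcc);
    }

    /** A cell is visible only if it was written at or before the captured read point. */
    boolean isVisible(String region, Cell cell) {
        Long readPoint = readPoints.get(region);
        return readPoint != null && cell.mvcc <= readPoint;
    }

    /** Filter a region's cells by the captured read point, e.g. after a scanner restart. */
    List<Cell> scan(String region, List<Cell> cells) {
        List<Cell> visible = new ArrayList<>();
        for (Cell cell : cells) {
            if (isVisible(region, cell)) {
                visible.add(cell);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        ReadPointSnapshot snapshot = new ReadPointSnapshot();
        snapshot.capture("regionA", 5L);

        // The second cell was written after the read point was captured.
        List<Cell> regionA = Arrays.asList(
                new Cell("r1", "old", 3L),
                new Cell("r2", "new", 7L));

        System.out.println(snapshot.scan("regionA", regionA).size()); // prints 1
    }
}
```

Because the read point lives with the client rather than the server, the same filter applies no matter which server ends up hosting the region, provided the mvcc numbers themselves survive the move.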

> implement mvcc-consistent scanners (across recovery)
> ----------------------------------------------------
>                 Key: HBASE-10241
>                 URL: https://issues.apache.org/jira/browse/HBASE-10241
>             Project: HBase
>          Issue Type: New Feature
>          Components: HFile, regionserver, Scanners
>    Affects Versions: 0.99.0
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: Consistent scanners.pdf
> Scanners currently use mvcc for consistency. However, mvcc is lost on server restart,
> or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or
> some other number, see HBASE-8763) between servers. First, client scanner needs to get and
> store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be
> stored in store files per KV and discarded when not needed.
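
The third step in the description (keep mvcc per KV in store files, discard it when not needed) could look roughly like the sketch below. This is an illustration under assumptions, not the patch: MvccCompactionSketch, KeyValue, and compact are hypothetical names, and "not needed" is interpreted here as "at or below the smallest read point held by any open scanner", in which case the number is reset to 0 because every reader can already see the KV.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MvccCompactionSketch {
    /** Hypothetical KV: only the row key and its mvcc number matter for this sketch. */
    static final class KeyValue {
        final String row;
        final long mvcc;

        KeyValue(String row, long mvcc) {
            this.row = row;
            this.mvcc = mvcc;
        }
    }

    /**
     * Rewrite KVs as a compaction might: an mvcc number at or below the smallest
     * read point no longer affects visibility for any scanner, so it is dropped
     * (stored as 0). Newer mvcc numbers are preserved. This rule is an assumption.
     */
    static List<KeyValue> compact(List<KeyValue> kvs, long smallestReadPoint) {
        List<KeyValue> rewritten = new ArrayList<>(kvs.size());
        for (KeyValue kv : kvs) {
            long kept = kv.mvcc <= smallestReadPoint ? 0L : kv.mvcc;
            rewritten.add(new KeyValue(kv.row, kept));
        }
        return rewritten;
    }

    public static void main(String[] args) {
        List<KeyValue> kvs = Arrays.asList(new KeyValue("a", 4L), new KeyValue("b", 12L));
        for (KeyValue kv : compact(kvs, 10L)) {
            System.out.println(kv.row + " mvcc=" + kv.mvcc);
        }
    }
}
```

With a smallest read point of 10, the KV written at mvcc 4 loses its number while the one written at mvcc 12 keeps it, so a scanner holding read point 10 still skips the newer KV.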

