hbase-issues mailing list archives

From "ryan rawson (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3498) Memstore scanner needs new semantics, which may require new data structure
Date Wed, 02 Feb 2011 23:16:29 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989859#comment-12989859 ]

ryan rawson commented on HBASE-3498:

After talking to stack about the unstable peek(), I think it might be OK.

Consider a scanner at the 'current' position whose peek()ed value is 'C'. When we peek()
again later, we get 'B' or some other value < 'C' (but larger than the previously "gotten"
values). This scanner is still the 'current' scanner in the KeyValueHeap, and we recheck the
priority queue only during next(). At that point a different scanner might become the
'current' (aka top) scanner, but it could NOT have previously been the current scanner,
because that would have required it to have a value < 'C', which it did not.

This works only if at most one KeyValueScanner has an unstable peek(), and only if that
peek() is unstable in one direction (the value can only get smaller).
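
The argument above can be tried out on a toy model. This is only a sketch under stated assumptions: SimpleScanner, ListScanner, LiveScanner and SimpleHeap are hypothetical stand-ins written for this illustration, not the real KeyValueScanner or KeyValueHeap classes, and values are plain strings. One scanner reads a live sorted set, so its peek() can shrink, and the heap rechecks its priority queue only in next().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.TreeSet;

// Hypothetical stand-in for a scanner; NOT the real HBase KeyValueScanner interface.
interface SimpleScanner {
    String peek(); // next value without consuming it, or null when exhausted
    String next(); // consume and return the next value
}

// Stable scanner over a fixed sorted list (think: a store file).
final class ListScanner implements SimpleScanner {
    private final Iterator<String> it;
    private String head;
    ListScanner(List<String> sorted) {
        it = sorted.iterator();
        head = it.hasNext() ? it.next() : null;
    }
    public String peek() { return head; }
    public String next() {
        String v = head;
        head = it.hasNext() ? it.next() : null;
        return v;
    }
}

// Scanner over a live sorted set: peek() can get smaller if a key is inserted
// between the last returned value and the old peek()ed value.
final class LiveScanner implements SimpleScanner {
    private final TreeSet<String> data;
    private String last; // last value handed out by next()
    LiveScanner(TreeSet<String> data) { this.data = data; }
    public String peek() {
        if (last == null) return data.isEmpty() ? null : data.first();
        return data.higher(last);
    }
    public String next() {
        String v = peek();
        if (v != null) last = v;
        return v;
    }
}

// Simplified heap: holds a 'current' scanner and rechecks the queue only in next().
final class SimpleHeap {
    private final PriorityQueue<SimpleScanner> queue =
        new PriorityQueue<SimpleScanner>((a, b) -> a.peek().compareTo(b.peek()));
    private SimpleScanner current;
    SimpleHeap(List<SimpleScanner> scanners) {
        for (SimpleScanner s : scanners) {
            if (s.peek() != null) queue.add(s);
        }
        current = queue.poll();
    }
    String peek() { return current == null ? null : current.peek(); }
    String next() {
        if (current == null) return null;
        String v = current.next();
        if (current.peek() != null) queue.add(current); // re-insert with its new head
        current = queue.poll();                          // the only place 'current' changes
        return v;
    }
}

final class UnstablePeekDemo {
    static List<String> run() {
        TreeSet<String> live = new TreeSet<>(Arrays.asList("C"));
        List<SimpleScanner> scanners = new ArrayList<>();
        scanners.add(new ListScanner(Arrays.asList("A", "D")));
        scanners.add(new LiveScanner(live));
        SimpleHeap heap = new SimpleHeap(scanners);
        List<String> out = new ArrayList<>();
        out.add(heap.next()); // consumes "A"; the live scanner is now current, peek() is "C"
        live.add("B");        // insert while the live scanner is current; peek() drops to "B"
        for (String v = heap.next(); v != null; v = heap.next()) out.add(v);
        return out;
    }
    public static void main(String[] args) {
        System.out.println(run()); // prints [A, B, C, D]
    }
}
```

The output stays sorted because the shrinking peek() belongs to the current scanner, and every other scanner already holds a value greater than 'C'. The sketch does not cover the cases the comment rules out: if the live scanner were sitting inside the queue when its head changed, the queue ordering would be stale.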

> Memstore scanner needs new semantics, which may require new data structure
> --------------------------------------------------------------------------
>                 Key: HBASE-3498
>                 URL: https://issues.apache.org/jira/browse/HBASE-3498
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: ryan rawson
>            Assignee: ryan rawson
>             Fix For: 0.92.0
> We may need a new memstore data structure. Much has been written about the concurrency,
> speed and CPU usage, but new issues were brought to light by HBASE-2856.

> Specifically, we need a memstore scanner that serves up-to-the-moment reads with row-level
> completeness. After a memstore scanner goes past the end of a row, it should
> return some kind of 'end of row' token, which the StoreScanner should trigger on to know it's
> at the end of the row. The next call to the memstore scanner's next() should return the _very next
> available row from the start of that row_ at _the time it's requested_.
> It should specifically NOT:
> - return everything but the first column
> - skip a row that was inserted _after_ the previous next() was completed
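
The requested semantics can be illustrated with a small model. This is a rough sketch under assumptions, not a proposal for the real memstore: RowAwareScanner, END_OF_ROW and the "row/column=value" string format are hypothetical names made up for this example, and copying a row's cells on entry is just one simple way to approximate row-level completeness.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical scanner over a sorted map of row -> (column -> value).
final class RowAwareScanner {
    static final String END_OF_ROW = "<END_OF_ROW>"; // hypothetical 'end of row' token
    private final ConcurrentSkipListMap<String, TreeMap<String, String>> rows;
    private String lastRow;         // last row this scanner entered, null before the first
    private Iterator<String> cells; // remaining cells of the current row, null between rows

    RowAwareScanner(ConcurrentSkipListMap<String, TreeMap<String, String>> rows) {
        this.rows = rows;
    }

    // Returns "row/column=value", END_OF_ROW, or null if no further row exists right now.
    String next() {
        if (cells == null) {
            // Look up the very next row at the time of the request, not earlier.
            Map.Entry<String, TreeMap<String, String>> row =
                lastRow == null ? rows.firstEntry() : rows.higherEntry(lastRow);
            if (row == null) return null;
            lastRow = row.getKey();
            // Copy the row's cells so the whole row is served from its first column.
            List<String> snapshot = new ArrayList<>();
            for (Map.Entry<String, String> c : row.getValue().entrySet()) {
                snapshot.add(lastRow + "/" + c.getKey() + "=" + c.getValue());
            }
            cells = snapshot.iterator();
        }
        if (cells.hasNext()) return cells.next();
        cells = null;      // past the end of the row
        return END_OF_ROW;
    }
}

final class RowScanDemo {
    static TreeMap<String, String> row(String... kv) {
        TreeMap<String, String> m = new TreeMap<>();
        for (int i = 0; i + 1 < kv.length; i += 2) m.put(kv[i], kv[i + 1]);
        return m;
    }
    static List<String> run() {
        ConcurrentSkipListMap<String, TreeMap<String, String>> rows = new ConcurrentSkipListMap<>();
        rows.put("r1", row("a", "1", "b", "2"));
        rows.put("r3", row("a", "5"));
        RowAwareScanner scanner = new RowAwareScanner(rows);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < 3; i++) out.add(scanner.next()); // r1/a=1, r1/b=2, END_OF_ROW
        rows.put("r2", row("a", "3", "b", "4"));             // inserted after the previous next()
        for (String v = scanner.next(); v != null; v = scanner.next()) out.add(v);
        return out;
    }
    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this model, row r2 is inserted after the scanner has already returned the end-of-row token for r1, and the following next() still returns r2 starting from its first column, which are the two behaviors listed above as things the scanner must not get wrong.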


