hbase-issues mailing list archives

From "ryan rawson (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3498) Memstore scanner needs new semantics, which may require new data structure
Date Wed, 02 Feb 2011 22:53:29 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989847#comment-12989847 ]

ryan rawson commented on HBASE-3498:

One of the problems with the memstore scanner is that peeking is _destructive_. When you
'peek', you get back whatever the internal iterator is pointing to, and when you call next()
you return THAT value and then move the iterator forward. So the iterator pointer is always
pointing to the peek()able value, rather than the 'current' value.
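
To make the look-ahead behaviour concrete, here is a minimal sketch of a scanner whose iterator always sits one element ahead of the caller. The class name LookAheadScanner and the use of plain strings as keys are assumptions for illustration; this is not the actual MemStoreScanner code.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical stand-in for the memstore scanner (not HBase's real class).
class LookAheadScanner {
    private final Iterator<String> it;
    // The value peek() hands out. The iterator has already moved past it.
    private String nextValue;

    LookAheadScanner(ConcurrentSkipListSet<String> store) {
        this.it = store.iterator();
        this.nextValue = it.hasNext() ? it.next() : null;
    }

    // Returns the pre-fetched value without advancing.
    String peek() {
        return nextValue;
    }

    // Returns the pre-fetched value, then pre-fetches the following one.
    String next() {
        String ret = nextValue;
        nextValue = it.hasNext() ? it.next() : null;
        return ret;
    }

    // Records what the caller observes: peek(), next(), then peek() again.
    static List<String> demo() {
        ConcurrentSkipListSet<String> store =
            new ConcurrentSkipListSet<>(Arrays.asList("row1", "row2", "row3"));
        LookAheadScanner s = new LookAheadScanner(store);
        return Arrays.asList(s.peek(), s.next(), s.peek());
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that the scanner never holds a 'current' position: by the time next() hands a value back, the underlying iterator has already committed to the element after it.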

If the memstore scanner were not destructive, then when the scanner stack called peek()
we'd be looking at the NEXT row, while the iterator would still be pointing to whatever
is 'now'.

There is one problem with this: the KeyValueHeap doesn't like it when peek() changes.
Since the value of MemStoreScanner.peek() would change whenever someone else inserted
rows, this causes major problems in KVH, in that we would now get results out of order.
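
The ordering problem can be reproduced with a small sketch. It assumes a heap of scanners ordered by peek(), loosely modelled on what KeyValueHeap does, with java.util.PriorityQueue standing in for it. KVScanner, LiveScanner and ListScanner are hypothetical names, and keys are plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.concurrent.ConcurrentSkipListSet;

class ChangingPeekDemo {
    interface KVScanner { String peek(); String next(); }

    // Non-look-ahead scanner: remembers the last returned key and
    // computes peek() live, so a concurrent insert can change it.
    static final class LiveScanner implements KVScanner {
        private final ConcurrentSkipListSet<String> store;
        private String current; // null until the first next()
        LiveScanner(ConcurrentSkipListSet<String> store) { this.store = store; }
        public String peek() {
            // "" is the smallest String, so ceiling("") is the first key (or null).
            return current == null ? store.ceiling("") : store.higher(current);
        }
        public String next() { current = peek(); return current; }
    }

    // Scanner over immutable data (think: a store file); peek() never changes.
    static final class ListScanner implements KVScanner {
        private final Iterator<String> it;
        private String nextValue;
        ListScanner(List<String> data) {
            it = data.iterator();
            nextValue = it.hasNext() ? it.next() : null;
        }
        public String peek() { return nextValue; }
        public String next() {
            String ret = nextValue;
            nextValue = it.hasNext() ? it.next() : null;
            return ret;
        }
    }

    // Merges both scanners through a heap; "b" is inserted after the first result.
    static List<String> run() {
        ConcurrentSkipListSet<String> memstore = new ConcurrentSkipListSet<>();
        memstore.add("d");
        Comparator<KVScanner> byPeek = Comparator.comparing(KVScanner::peek);
        PriorityQueue<KVScanner> heap = new PriorityQueue<>(byPeek);
        heap.add(new LiveScanner(memstore));
        heap.add(new ListScanner(Arrays.asList("c", "e")));

        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            KVScanner top = heap.poll();
            out.add(top.next());
            if (out.size() == 1) {
                memstore.add("b"); // simulated concurrent insert
            }
            if (top.peek() != null) {
                heap.add(top); // re-insert under its (possibly changed) key
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this scenario "c" has already been emitted when "b" arrives, and the live scanner then hands "b" to the heap, so the merged stream is no longer sorted. A look-ahead scanner would keep the stream sorted, at the price of possibly never seeing "b" at all.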

> Memstore scanner needs new semantics, which may require new data structure
> --------------------------------------------------------------------------
>                 Key: HBASE-3498
>                 URL: https://issues.apache.org/jira/browse/HBASE-3498
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: ryan rawson
>            Assignee: ryan rawson
>             Fix For: 0.92.0
> We may need a new memstore data structure. Much has been written about the concurrency,
> speed, and CPU usage, but new issues were brought to light by HBASE-2856.
>
> Specifically, we need a memstore scanner that serves up-to-the-moment reads with row-level
> completeness. After a memstore scanner goes past the end of a row, it should
> return some kind of 'end of row' token, which the StoreScanner should trigger on to know it's
> at the end of the row. The next call to the memstore scanner's next() should return the _very next
> available row, from the start of that row_, at _the time it's requested_.
> It should specifically NOT:
> - return everything but the first column
> - skip a row that was inserted _after_ the previous next() was completed
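
One possible shape for the requested semantics is sketched below. It assumes a two-level layout (rows in a ConcurrentSkipListMap, columns in a ConcurrentSkipListSet), string keys, and a sentinel END_OF_ROW value. All of these, along with the name RowAwareScanner, are hypothetical choices for illustration and not a proposed HBase design. The sketch also ignores row deletion and atomicity of writes within a row.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.ConcurrentSkipListSet;

class RowAwareScanner {
    static final String END_OF_ROW = "<END_OF_ROW>"; // hypothetical sentinel token

    private final ConcurrentSkipListMap<String, ConcurrentSkipListSet<String>> store;
    private String row;             // row being scanned, or the last finished row
    private String column;          // last column returned in this row, null at row start
    private boolean rowDone = true; // true when the next call must look up a fresh row

    RowAwareScanner(ConcurrentSkipListMap<String, ConcurrentSkipListSet<String>> store) {
        this.store = store;
    }

    static void put(ConcurrentSkipListMap<String, ConcurrentSkipListSet<String>> store,
                    String row, String col) {
        store.computeIfAbsent(row, r -> new ConcurrentSkipListSet<>()).add(col);
    }

    // Returns "row/column", END_OF_ROW after the last column of a row,
    // or null when no further row exists at the time of the call.
    String next() {
        if (rowDone) {
            // Resolve the very next row now, not when the previous row ended.
            String nextRow;
            if (row == null) {
                Map.Entry<String, ConcurrentSkipListSet<String>> first = store.firstEntry();
                nextRow = (first == null) ? null : first.getKey();
            } else {
                nextRow = store.higherKey(row);
            }
            if (nextRow == null) {
                return null;
            }
            row = nextRow;
            column = null;
            rowDone = false;
        }
        ConcurrentSkipListSet<String> cols = store.get(row);
        // "" is the smallest String, so ceiling("") yields the first column.
        String nextCol = (column == null) ? cols.ceiling("") : cols.higher(column);
        if (nextCol == null) {
            rowDone = true;
            return END_OF_ROW;
        }
        column = nextCol;
        return row + "/" + column;
    }

    // Scans row1, then inserts row2 before asking for the next row.
    static List<String> demo() {
        ConcurrentSkipListMap<String, ConcurrentSkipListSet<String>> store =
            new ConcurrentSkipListMap<>();
        put(store, "row1", "a");
        put(store, "row1", "b");
        put(store, "row3", "a");
        RowAwareScanner s = new RowAwareScanner(store);
        List<String> out = new ArrayList<>();
        out.add(s.next());
        out.add(s.next());
        out.add(s.next()); // end-of-row token for row1
        put(store, "row2", "a"); // inserted after the previous next() completed
        put(store, "row2", "b");
        out.add(s.next()); // row2 is neither skipped nor entered mid-row
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the scanner only remembers the last finished row and looks up its successor on demand, a row inserted between two calls is picked up from its first column. Feeding such a scanner into a heap still runs into the changing-peek() problem described in the comment above.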

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

