hbase-dev mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1537) Intra-row scanning
Date Wed, 17 Jun 2009 22:08:07 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720934#action_12720934 ]

Andrew Purtell commented on HBASE-1537:

What we did for Stargate scanners was to make them iterators over cells and then allow each
scanner to specify the number of cells it would like returned in one batch. Doing this in the
region servers is more complicated internally, but I think similar semantics would be good.
There are a few options for handling a batch that crosses a row boundary:

- Include row key as well as column and timestamp with each cell value. This is not as expensive
as it might sound if a simple string table encoding is used with a marker or two meaning "use
last given row key" and "use last given column". Either Thrift or pbufs can handle this by
marking row and column keys as optional. 

- Make Result capable of holding more than one row. 

- Return early to the client at row boundary and make it do scanner.next() to start up again
on the next row. 
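
To make the first option concrete, here is a rough sketch in Python of batching cells
while omitting repeated row and column keys. This is not Stargate or region server code.
Cell, WireCell, encode_batch, decode_batch and batches are hypothetical names made up for
illustration, and None stands in for the "use last given row key" and "use last given
column" markers (the same role an unset optional field would play in Thrift or pbufs).

```python
from typing import Iterable, Iterator, List, NamedTuple, Optional


class Cell(NamedTuple):
    """A fully qualified cell as the scanner sees it (hypothetical type)."""
    row: bytes
    column: bytes
    timestamp: int
    value: bytes


class WireCell(NamedTuple):
    """A cell as sent to the client (hypothetical type).

    row or column set to None means "use last given row key" or
    "use last given column" respectively.
    """
    row: Optional[bytes]
    column: Optional[bytes]
    timestamp: int
    value: bytes


def encode_batch(cells: List[Cell]) -> List[WireCell]:
    """Drop row/column keys that repeat the previous cell in this batch."""
    last_row = last_column = None
    wire = []
    for cell in cells:
        row = None if cell.row == last_row else cell.row
        column = None if cell.column == last_column else cell.column
        wire.append(WireCell(row, column, cell.timestamp, cell.value))
        last_row, last_column = cell.row, cell.column
    return wire


def decode_batch(wire: List[WireCell]) -> List[Cell]:
    """Rebuild full cells by carrying the last seen row and column forward."""
    row = column = None
    cells = []
    for w in wire:
        row = w.row if w.row is not None else row
        column = w.column if w.column is not None else column
        if row is None or column is None:
            raise ValueError("batch must start with explicit row and column")
        cells.append(Cell(row, column, w.timestamp, w.value))
    return cells


def batches(cells: Iterable[Cell], batch_size: int) -> Iterator[List[WireCell]]:
    """Yield encoded batches of at most batch_size cells.

    Batches may cross row boundaries. Encoder state is reset per batch,
    so every batch can be decoded on its own.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[Cell] = []
    for cell in cells:
        batch.append(cell)
        if len(batch) == batch_size:
            yield encode_batch(batch)
            batch = []
    if batch:
        yield encode_batch(batch)
```

Resetting the markers at the start of every batch costs one extra row key and column per
batch, but it means the client never has to remember state from an earlier response.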

> Intra-row scanning
> ------------------
>                 Key: HBASE-1537
>                 URL: https://issues.apache.org/jira/browse/HBASE-1537
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Jonathan Gray
>             Fix For: 0.21.0
> To continue scaling numbers of columns or versions in a single row, we need a mechanism
> to scan within a row so we can return some columns at a time.  Currently, an entire row must
> come back as one piece.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
