hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11803) Programming model for reverse scan is confusing
Date Sat, 23 Aug 2014 17:22:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108058#comment-14108058 ]

Andrew Purtell commented on HBASE-11803:
----------------------------------------

bq. You can decrease the last byte by one, but you need to add an indeterminate 0xFF bytes
to ensure you're not including a row unintentionally.

An early version of the reverse scan patch did this and made the number of 0xFF bytes to use
a site configuration option. That's not acceptable. The number of bytes needed is indeterminate
from the client API's perspective. It will vary by application keying strategy. 
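
To illustrate why no fixed pad length is safe, here is a minimal, self-contained sketch. It does not use any HBase classes; the class and method names are hypothetical and exist only for this example. It approximates the largest key below an exclusive bound by decrementing the last byte and appending a fixed number of 0xFF bytes, assuming row keys sort as unsigned bytes in lexicographic order.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of HBase. */
final class PaddedPredecessorSketch {

    /**
     * Approximates the largest key that sorts below exclusiveBound by
     * decrementing its last byte and appending padLength 0xFF bytes.
     * Any key that shares the decremented prefix and carries more than
     * padLength trailing 0xFF bytes still sorts above the result.
     */
    static byte[] approximatePredecessor(byte[] exclusiveBound, int padLength) {
        if (exclusiveBound.length == 0) {
            throw new IllegalArgumentException("an empty key has no predecessor");
        }
        int last = exclusiveBound.length - 1;
        if (exclusiveBound[last] == 0) {
            // The key without its trailing 0x00 is the exact predecessor.
            return Arrays.copyOf(exclusiveBound, last);
        }
        byte[] result = Arrays.copyOf(exclusiveBound, exclusiveBound.length + padLength);
        result[last]--; // decrement the last byte of the original key
        Arrays.fill(result, last + 1, result.length, (byte) 0xFF);
        return result;
    }

    /** True if key sorts strictly between the padded approximation and the bound. */
    static boolean wouldBeMissed(byte[] key, byte[] exclusiveBound, int padLength) {
        byte[] approx = approximatePredecessor(exclusiveBound, padLength);
        return Arrays.compareUnsigned(approx, key) < 0
            && Arrays.compareUnsigned(key, exclusiveBound) < 0;
    }
}
```

With a bound of "row2" and a pad length of 4, a row keyed "row1" followed by five 0xFF bytes sorts above the padded key and below the bound, so a reverse scan starting at the padded key would skip it. Whatever pad length is chosen, an application whose keys are longer can produce such a row.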

bq. Should the creation of ExclusiveStartFilter be done in separate JIRA ?

I don't think that is necessary. We could do something in the context of this issue like:
1. Add ExclusiveStartFilter
2. Add a static helper method in Scan like
{code}
public static Scan makeReversed(Scan scan)
{code}
(or choose a better name) that takes a forward scan and makes all necessary transformations
such as setting the reversed flag, swapping start and end rows, adding the ExclusiveStartFilter.
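
As a rough sketch of what such a helper might do, the following self-contained model avoids the HBase classes entirely. ScanSpec, its fields, and the in-memory table are hypothetical stand-ins; in particular, the per-bound inclusive flags are an assumption that plays the role the proposed ExclusiveStartFilter would play, and both row bounds are assumed to be non-empty. The point is only that swapping the rows while carrying each bound's inclusivity along yields the same rows in the opposite order, with no byte arithmetic on the keys.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

/** Hypothetical model of a scan specification; not the HBase Scan class. */
final class ScanSpec {
    final byte[] startRow;
    final boolean startInclusive;
    final byte[] stopRow;
    final boolean stopInclusive;
    final boolean reversed;

    ScanSpec(byte[] startRow, boolean startInclusive,
             byte[] stopRow, boolean stopInclusive, boolean reversed) {
        this.startRow = startRow;
        this.startInclusive = startInclusive;
        this.stopRow = stopRow;
        this.stopInclusive = stopInclusive;
        this.reversed = reversed;
    }

    /** Forward scan over [startRow, stopRow), the same shape as region boundaries. */
    static ScanSpec forward(byte[] startRow, byte[] stopRow) {
        return new ScanSpec(startRow, true, stopRow, false, false);
    }

    /** Swaps the rows, keeps each bound's inclusivity, and sets the reversed flag. */
    static ScanSpec makeReversed(ScanSpec scan) {
        if (scan.reversed) {
            throw new IllegalArgumentException("scan is already reversed");
        }
        return new ScanSpec(scan.stopRow, scan.stopInclusive,
                            scan.startRow, scan.startInclusive, true);
    }

    /** Runs the scan against an in-memory sorted set standing in for a table. */
    List<String> run(NavigableSet<byte[]> rows) {
        NavigableSet<byte[]> range = reversed
            // Reversed: stopRow is the lower bound, startRow the upper bound.
            ? rows.subSet(stopRow, stopInclusive, startRow, startInclusive).descendingSet()
            : rows.subSet(startRow, startInclusive, stopRow, stopInclusive);
        List<String> out = new ArrayList<>();
        for (byte[] row : range) {
            out.add(new String(row, StandardCharsets.US_ASCII));
        }
        return out;
    }

    static byte[] bytes(String key) {
        return key.getBytes(StandardCharsets.US_ASCII);
    }

    /** Builds a table of row keys ordered as unsigned bytes, lexicographically. */
    static NavigableSet<byte[]> table(String... keys) {
        Comparator<byte[]> order = Arrays::compareUnsigned;
        NavigableSet<byte[]> rows = new TreeSet<>(order);
        for (String key : keys) {
            rows.add(bytes(key));
        }
        return rows;
    }
}
```

For a table holding rows a, b, c and d, a forward scan over [b, d) returns b then c, and the spec produced by makeReversed returns c then b. The caller never has to compute a neighbouring key, which is the property the helper is meant to provide.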



> Programming model for reverse scan is confusing
> -----------------------------------------------
>
>                 Key: HBASE-11803
>                 URL: https://issues.apache.org/jira/browse/HBASE-11803
>             Project: HBase
>          Issue Type: Bug
>          Components: Client
>    Affects Versions: 0.98.1
>            Reporter: James Taylor
>
> The reverse scan is a very nice feature in HBase. We leverage it in Apache Phoenix 4.1
when possible and see a huge boost in performance over re-ordering the result set ourselves.
> However, the way in which you have to adjust the start/stop key is confusing. Our use
case is that we have a scan that needs to be done and we've calculated an inclusive start
row and an exclusive stop row. This is the way region boundaries are which is convenient as
they can easily be intersected against the scan stop/start row. When we use a reverse scan,
we are forced to switch the start and stop row values of the scan *and* adjust the byte values
from inclusive to exclusive and from exclusive to inclusive. The former is not too bad, as
you can just add a zero byte, but the latter is problematic. You can decrease the last byte
by one, but you need to add an indeterminate 0xFF bytes to ensure you're not including a row
unintentionally.
> IMHO, it would be much cleaner to just keep the start/stop row as is and just call
the Scan.setReversed(true) method.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
