hbase-issues mailing list archives

From "James Taylor (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11803) Programming model for reverse scan is confusing
Date Sat, 23 Aug 2014 18:02:12 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108074#comment-14108074 ]

James Taylor commented on HBASE-11803:

Thanks for all the ideas, feedback, and workarounds everyone - much appreciated. 

bq. The number of bytes needed is indeterminate from the client API's perspective. It will
vary by application keying strategy.
This is a very good point. I _think_ I've reasoned out that from a Phoenix POV, adding a single
0xFF byte is sufficient.

bq. We could do something in the context of this issue like add a static helper method in
Scan that makes all the necessary transformations
From an API POV, I think this would be an improvement. Phoenix will likely stick with
what it's doing now for a couple of reasons: 1) we wouldn't want to introduce a runtime dependency
on a later 0.98 HBase version for an issue we've already worked around, and 2) I'd worry that
there's unnecessary overhead in adding Filters (unnecessary in that I can reason out how
many 0xFF bytes to add to prevent any issues).

My reason for filing the JIRA is more around just giving my two cents on where I think HBase
APIs can be improved. Phoenix hides all the complexity and nuances of using the HBase API
by providing a well-understood SQL API on top of it (that's part of its value). Please take/leave
my feedback as you see fit.

Ideally, it'd be nice if HBase had a KeyRange class that includes: byte[] lowerRange, boolean
lowerInclusive, byte[] upperRange, boolean upperInclusive. Then Scan would contain a KeyRange.
I realize this is likely infeasible to change in HBase at this point, though.  Maybe in 2.0?
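
To make the suggestion concrete, here is a minimal sketch of what such a class might look like. KeyRange is hypothetical and is not an existing HBase class; the contains method and the unsigned byte comparison are assumptions added for illustration, and the sketch does not model HBase's convention of treating an empty row key as unbounded.

```java
// Hypothetical KeyRange sketch; not part of the HBase API.
final class KeyRange {
    final byte[] lowerRange;
    final boolean lowerInclusive;
    final byte[] upperRange;
    final boolean upperInclusive;

    KeyRange(byte[] lowerRange, boolean lowerInclusive,
             byte[] upperRange, boolean upperInclusive) {
        // Defensive copies so callers cannot mutate the bounds afterwards.
        this.lowerRange = lowerRange.clone();
        this.lowerInclusive = lowerInclusive;
        this.upperRange = upperRange.clone();
        this.upperInclusive = upperInclusive;
    }

    /** Returns true if rowKey falls inside this range, honoring both inclusivity flags. */
    boolean contains(byte[] rowKey) {
        int lo = compare(rowKey, lowerRange);
        int hi = compare(rowKey, upperRange);
        boolean aboveLower = lowerInclusive ? lo >= 0 : lo > 0;
        boolean belowUpper = upperInclusive ? hi <= 0 : hi < 0;
        return aboveLower && belowUpper;
    }

    /** Lexicographic comparison that treats each byte as unsigned. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        // A key that is a strict prefix of another sorts first.
        return a.length - b.length;
    }
}
```

With explicit inclusivity flags, a scan direction could in principle be flipped without rewriting either bound, which is the appeal of the idea.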

> Programming model for reverse scan is confusing
> -----------------------------------------------
>                 Key: HBASE-11803
>                 URL: https://issues.apache.org/jira/browse/HBASE-11803
>             Project: HBase
>          Issue Type: Bug
>          Components: Client
>    Affects Versions: 0.98.1
>            Reporter: James Taylor
>            Assignee: Ted Yu
> The reverse scan is a very nice feature in HBase. We leverage it in Apache Phoenix 4.1
when possible and see a huge boost in performance over re-ordering the result set ourselves.
> However, the way in which you have to adjust the start/stop key is confusing. Our use
case is that we have a scan that needs to be done and we've calculated an inclusive start
row and an exclusive stop row. This matches how region boundaries are defined, which is convenient because
they can easily be intersected against the scan stop/start row. When we use a reverse scan,
we are forced to switch the start and stop row values of the scan *and* adjust the byte values
from inclusive to exclusive and from exclusive to inclusive. The former is not too bad, as
you can just add a zero byte, but the latter is problematic. You can decrease the last byte
by one, but you need to add an indeterminate number of 0xFF bytes to ensure you're not including a row
> IMHO, it would be much cleaner to just keep the start/stop row as is and just call
the Scan.setReversed(true) method.
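
The two byte-level adjustments described in the issue can be sketched roughly as follows. This is an illustration only: ReverseScanKeys, nextKey, previousKey, and the padding parameter are hypothetical names, not HBase or Phoenix APIs, and keys are assumed to sort lexicographically as unsigned bytes.

```java
import java.util.Arrays;

// Hypothetical helper illustrating the key adjustments; not an HBase API.
final class ReverseScanKeys {
    /** Smallest key strictly greater than key: the same bytes plus a trailing 0x00. */
    static byte[] nextKey(byte[] key) {
        return Arrays.copyOf(key, key.length + 1); // copyOf pads with 0x00
    }

    /**
     * Approximates the largest key strictly less than key by decrementing the
     * last byte and appending `padding` 0xFF bytes. The result is only a true
     * predecessor if no stored row key sorts between it and `key`, which is
     * why the required padding depends on the application's keying strategy.
     */
    static byte[] previousKey(byte[] key, int padding) {
        if (key.length == 0) {
            throw new IllegalArgumentException("empty key has no predecessor");
        }
        int last = key.length - 1;
        if (key[last] == 0) {
            // Dropping a trailing 0x00 yields the exact predecessor.
            return Arrays.copyOf(key, last);
        }
        byte[] out = new byte[key.length + padding];
        System.arraycopy(key, 0, out, 0, key.length);
        out[last]--; // byte arithmetic wraps, so 0x80 becomes 0x7F as intended
        Arrays.fill(out, key.length, out.length, (byte) 0xFF);
        return out;
    }

    /** Lexicographic comparison that treats each byte as unsigned. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }
}
```

For example, previousKey applied to the one-byte key 'b' with a padding of 1 yields 'a' followed by 0xFF, yet a row key of 'a', 0xFF, 0x01 still sorts between that result and 'b'. That gap is the indeterminacy the issue describes.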

