hbase-issues mailing list archives

From "Niels Basjes (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11990) Make setting the start and stop row for a specific prefix easier
Date Mon, 22 Sep 2014 21:01:35 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143812#comment-14143812 ]

Niels Basjes commented on HBASE-11990:
--------------------------------------

The effect is that in almost all cases where this filter is combined with MR, a "large"
number of mappers will be started and stopped without having seen any data.
These needless tasks can be avoided beforehand by using the original approach of my patch.

I don't yet see in which case setting the startRow and stopRow would require changing
getSplits. The way the stopRow value is calculated from the prefix should not trigger such
an effect.
Can you give me an example where you expect to hit such an edge case?
I'll include it as an additional test.
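
To illustrate the point about needless mappers, here is a minimal sketch in plain Java (no
HBase classes). It assumes one mapper per region whose key range overlaps the scan's
[startRow, stopRow) range, with an empty string meaning "unbounded". The class name, the
region boundaries and the use of String keys are hypothetical and chosen only for
readability; this is not the actual getSplits code.

```java
final class SplitPruningSketch {
    // Hypothetical table with four regions; each entry is {startKey, endKey},
    // end exclusive, and "" means unbounded on that side.
    static final String[][] REGIONS = {
        {"", "b"}, {"b", "d"}, {"d", "f"}, {"f", ""}
    };

    // True if the region [regionStart, regionEnd) overlaps the scan range [startRow, stopRow).
    static boolean overlaps(String regionStart, String regionEnd, String startRow, String stopRow) {
        boolean startsBeforeStop = stopRow.isEmpty() || regionStart.compareTo(stopRow) < 0;
        boolean endsAfterStart = regionEnd.isEmpty() || regionEnd.compareTo(startRow) > 0;
        return startsBeforeStop && endsAfterStart;
    }

    // Number of regions (and so, under the stated assumption, mappers) the scan range touches.
    static int countSplits(String[][] regions, String startRow, String stopRow) {
        int count = 0;
        for (String[] region : regions) {
            if (overlaps(region[0], region[1], startRow, stopRow)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Filter only: the scan range is unbounded, so every region gets a mapper.
        System.out.println("unbounded scan: " + countSplits(REGIONS, "", ""));
        // Prefix "c" expressed as startRow "c" and stopRow "d": only one region overlaps.
        System.out.println("prefix range:   " + countSplits(REGIONS, "c", "d"));
    }
}
```

With a filter alone the range stays unbounded, so in this toy table all four regions would
get a task, while the bounded range touches only the region that can contain the prefix.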

> Make setting the start and stop row for a specific prefix easier
> ----------------------------------------------------------------
>
>                 Key: HBASE-11990
>                 URL: https://issues.apache.org/jira/browse/HBASE-11990
>             Project: HBase
>          Issue Type: New Feature
>          Components: Client
>            Reporter: Niels Basjes
>         Attachments: 11990v4.txt, HBASE-11990-20140916-v2.patch, HBASE-11990-20140916-v3.patch,
HBASE-11990-20140916-v5.patch, HBASE-11990-20140916-v6.patch, HBASE-11990-20140916.patch,
HBASE-11990-20140917-v7.patch, HBASE-11990-20140919-v8.patch, HBASE-11990-20140921-v9.patch
>
>
> If you want your application to scan for a specific row prefix, this is actually quite hard.
> As described in several places, you can set the startRow to the prefix; yet the stopRow
> should be set to the prefix '+1'.
> If the prefix is 'ASCII' put into a byte[], then this is easy because you can simply
> increment the last byte of the array.
> But if your application uses real binary rowids, you may run into the scenario that your
> prefix is something like
> {code}{ 0x12, 0x23, 0xFF, 0xFF }{code} Then the increment should be {code}{ 0x12, 0x24 }{code}
> I have prepared a proposed patch that makes setting these values correctly a lot easier.
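
The prefix '+1' calculation described above can be sketched as follows. This is a minimal,
self-contained illustration and not the code from the attached patches; the class and method
names are hypothetical. It assumes that an empty stop row means "scan to the end of the
table", which is what is returned when the prefix is empty or consists only of 0xFF bytes.

```java
import java.util.Arrays;

final class PrefixStopRow {
    private PrefixStopRow() {}

    // Returns the smallest key that sorts after every key starting with the prefix:
    // strip trailing 0xFF bytes, then increment the last remaining byte.
    // Returns an empty array when no such key exists (empty or all-0xFF prefix).
    static byte[] stopRowForPrefix(byte[] prefix) {
        int length = prefix.length;
        // A trailing 0xFF cannot be incremented without a carry, so drop it.
        while (length > 0 && prefix[length - 1] == (byte) 0xFF) {
            length--;
        }
        if (length == 0) {
            return new byte[0]; // assumption: empty stop row = no upper bound
        }
        byte[] stopRow = Arrays.copyOf(prefix, length);
        stopRow[length - 1]++; // safe: this byte is known not to be 0xFF
        return stopRow;
    }

    public static void main(String[] args) {
        byte[] prefix = {0x12, 0x23, (byte) 0xFF, (byte) 0xFF};
        StringBuilder hex = new StringBuilder();
        for (byte b : stopRowForPrefix(prefix)) {
            hex.append(String.format("0x%02X ", b));
        }
        System.out.println(hex.toString().trim());
    }
}
```

For the binary prefix from the description this yields { 0x12, 0x24 }, and for an 'ASCII'
prefix such as "abc" it reduces to incrementing the last byte, giving "abd". The scan would
then use the prefix itself as startRow and this value as the exclusive stopRow.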



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
