hbase-user mailing list archives

From Ryan Rawson <ryano...@gmail.com>
Subject Re: Composite key, scan on partial key
Date Wed, 15 Dec 2010 01:35:24 GMT
It isn't much less efficient, since you only select the data you
need to.  The extra filter call should be operating on well-cached
data and adds just a few extra comparisons. I don't have concrete
benchmarks; I'm just speaking from my knowledge of the
codebase.  Java is pretty good at dynamic inlining, and the JIT can
work wonders.

-ryan

On Tue, Dec 14, 2010 at 5:32 PM, Bryan Keller <bryanck@gmail.com> wrote:
> Isn't a filter much less efficient than specifying a range with the Scan object?
>
> On Dec 14, 2010, at 3:32 PM, Ryan Rawson wrote:
>
>> Hey,
>>
>> If the order ids are variable, then you will have to use a separator.
>> You then can use a start of 'foo:' and a prefix filter of 'foo:'.
>>
>> A start/end key pair won't work with variable-length IDs in this way.
>> But the good news is that a prefix filter is very efficient.
>>
>> Good luck!
>> -ryan
>>
>> On Tue, Dec 14, 2010 at 3:28 PM, Bryan Keller <bryanck@gmail.com> wrote:
>>> I had a question about using a Scan on part of a composite key. Say I have order
>>> line item rows, and the ID is order ID + line item ID. Each ID is a random string. I want
>>> to get all line items for an order with my Scan object.
>>>
>>> Setting the startRow on Scan is easy enough, just set it to the order ID and
>>> leave off the line item ID. However, because endRow is exclusive, I need to come up with a
>>> key that is just past the order ID. This would be straightforward if the keys are numeric
>>> (just add one to the order ID), but becomes kind of a kludge when the keys are strings.
>>>
>>> Right now I build the keys with a byte separator between the two strings and
>>> set it to 0 when storing. Then when I want to scan, I create the startRow with the Order ID
>>> + (byte)0, and the endRow with Order ID + (byte)1. Seems like kind of a waste to have that
>>> extra byte just for this purpose, though. Is there a better approach, like specifying the
>>> endRow inclusively?
>
>
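
For reference, the "key that is just past the order ID" idea from the
thread can be sketched in plain Java with no HBase dependency. This is
only an illustration under assumptions: the class and method names
(PrefixRange, orderPrefix, stopRowForPrefix) are hypothetical and do
not come from the thread or from the HBase API, and the keys are
assumed to be compared as unsigned bytes, the way HBase orders row
keys. The stop row is the prefix with its last byte incremented,
dropping any trailing 0xFF bytes first.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of HBase: builds scan bounds for a key prefix.
final class PrefixRange {

    // Order ID followed by a one-byte separator, e.g. "foo" + ':'.
    static byte[] orderPrefix(String orderId, byte separator) {
        byte[] id = orderId.getBytes(StandardCharsets.UTF_8);
        byte[] prefix = Arrays.copyOf(id, id.length + 1);
        prefix[id.length] = separator;
        return prefix;
    }

    // Smallest key that sorts after every key starting with the prefix
    // (unsigned byte order). Returns an empty array when no such key
    // exists, i.e. the prefix is empty or consists only of 0xFF bytes.
    static byte[] stopRowForPrefix(byte[] prefix) {
        int end = prefix.length;
        // A trailing 0xFF byte cannot be incremented, so drop it.
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // wraps 0x7F to 0x80, which is still "plus one" unsigned
        return stop;
    }

    public static void main(String[] args) {
        byte[] start = orderPrefix("foo", (byte) ':');
        byte[] stop = stopRowForPrefix(start);
        System.out.println(Arrays.toString(start));
        System.out.println(Arrays.toString(stop));
    }
}
```

The two arrays would then be used as the start row and the exclusive
stop row of a Scan. With a separator in the key this gives the same
range as the (byte)0 / (byte)1 trick described above, but it works for
any separator value.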
