hbase-user mailing list archives

From Ryan Rawson <ryano...@gmail.com>
Subject Re: Composite key, scan on partial key
Date Wed, 15 Dec 2010 01:35:24 GMT
It isn't much less efficient; you still only select the data you
need. The extra filter call should be operating on well-cached
data, and adds just a few extra comparisons. I don't have concrete
benchmarks; I'm just speaking based on my knowledge of the
codebase. Java is pretty good at dynamic inlining, and the JIT can
work wonders.
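
To illustrate why the prefix check costs only a few comparisons per row, here is a minimal sketch in plain Java that simulates the idea without the HBase client API. It seeks to the start key 'foo:' in a sorted set and stops at the first row that no longer carries the prefix. The class name, method name, and the use of a TreeSet of ASCII strings as a stand-in for sorted row keys are all assumptions made for this example, not HBase code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-in for a table: row keys held in sorted order.
// Assumes ASCII keys, so String ordering matches byte ordering.
class PrefixScanSketch {

    // Simulates "seek to the start row, then apply a prefix check per row".
    static List<String> scanPrefix(TreeSet<String> sortedKeys, String prefix) {
        List<String> out = new ArrayList<>();
        // tailSet(prefix, true) plays the role of the scan's start row.
        for (String key : sortedKeys.tailSet(prefix, true)) {
            if (!key.startsWith(prefix)) {
                break; // keys are sorted, so no later key can match
            }
            out.add(key);
        }
        return out;
    }
}
```

With keys such as 'foo:1', 'foo:22' and 'foobar:1', the separator is what keeps 'foobar:1' out of the result for order 'foo'.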


On Tue, Dec 14, 2010 at 5:32 PM, Bryan Keller <bryanck@gmail.com> wrote:
> Isn't a filter much less efficient than specifying a range with the Scan object?
> On Dec 14, 2010, at 3:32 PM, Ryan Rawson wrote:
>> Hey,
>> If the order IDs are variable length, then you will have to use a
>> separator. You can then use a start row of 'foo:' and a prefix filter
>> of 'foo:'. A start/end key pair won't work with variable-length IDs in
>> this way. But the good news is that the prefix filter is very efficient.
>> Good luck!
>> -ryan
>> On Tue, Dec 14, 2010 at 3:28 PM, Bryan Keller <bryanck@gmail.com> wrote:
>>> I had a question about using a Scan on part of a composite key. Say I
>>> have order line item rows, and the ID is order ID + line item ID. Each
>>> ID is a random string. I want to get all line items for an order with
>>> my Scan object.
>>> Setting the startRow on Scan is easy enough, just set it to the order
>>> ID and leave off the line item ID. However, because endRow is
>>> exclusive, I need to come up with a key that is just past the order
>>> ID. This would be straightforward if the keys are numeric (just add
>>> one to the order ID), but becomes kind of a kludge when the keys are
>>> strings.
>>> Right now I build the keys with a byte separator between the two
>>> strings and set it to 0 when storing. Then when I want to scan, I
>>> create the startRow with the Order ID + (byte)0, and the endRow with
>>> Order ID + (byte)1. Seems like kind of a waste to have that extra byte
>>> just for this purpose, though. Is there a better approach, like
>>> specifying the endRow inclusively?
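
The separator-byte scheme described in the original question can be sketched in plain Java as follows. This is an illustration only, with no HBase dependency: the class and method names are hypothetical, and it assumes that neither ID contains a 0x00 byte and that row keys sort as unsigned bytes in lexicographic order.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating orderId + separator + lineItemId keys.
class CompositeKeySketch {

    // Builds orderId + one marker byte + lineItemId (which may be empty).
    static byte[] key(String orderId, byte marker, String lineItemId) {
        byte[] a = orderId.getBytes(StandardCharsets.UTF_8);
        byte[] b = lineItemId.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[a.length + 1 + b.length];
        System.arraycopy(a, 0, out, 0, a.length);
        out[a.length] = marker;
        System.arraycopy(b, 0, out, a.length + 1, b.length);
        return out;
    }

    // Unsigned lexicographic byte comparison.
    static int compare(byte[] x, byte[] y) {
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int d = (x[i] & 0xff) - (y[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return x.length - y.length;
    }

    // True when startRow <= row < endRow, i.e. endRow is exclusive.
    static boolean inRange(byte[] row, byte[] startRow, byte[] endRow) {
        return compare(startRow, row) <= 0 && compare(row, endRow) < 0;
    }
}
```

For order "abc", the start row would be key("abc", (byte) 0, "") and the end row key("abc", (byte) 1, ""); a stored row key("abc", (byte) 0, "item7") falls inside that range, while a row for order "abcd" does not.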
