hbase-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Bryan Duxbury <br...@rapleaf.com>
Subject Re: Secondary indexes
Date Tue, 22 Apr 2008 20:10:05 GMT
If you just want the first 10 by a certain prefix ordered by a  
column, then you will definitely be better off scanning them by row  
and ordering them clientside.

Surely your idea of maintaining a separate SortedMap in each region  
would work, but I don't think you should discount the cost associated  
with having to talk to a bunch of different regions every time you  
want to know what the next row is. You'll do a lot of extra work to  
get that merged view of the index, and potentially the approach won't  
scale up to queries that might cover more than a "few" regions - can  
you imagine having to check 100 or 1000 regions for the next result  
every time you needed to iterate?

On Apr 22, 2008, at 12:58 PM, Clint Morgan wrote:

> Yeah, that would be an easy approach. We would need HBASE-82.
>
> The main problem I see here is that we cannot take (as much) advantage
> of our row key prefix in weeding out rows.
>
> Say I want the first 10 rows that start with XXX ordered by A:amount,
> then I would have scan through column values from rows everywhere in
> the table until I hit 10 with my prefix. Could be costly if table is
> large compared to the number of rows that start with XXX.
>
> Whereas if we have one SortedMap per Region, then I can quickly narrow
> down to (hopefully) a few regions based on key prefix.
>
> Though other usage / table loading patterns would surely benefit from
> this approach...
>
> On Tue, Apr 22, 2008 at 12:31 PM, Bryan Duxbury <bryan@rapleaf.com>  
> wrote:
>> This doesn't have to be all that complicated.
>>
>>  Why not keep another HBase table as the index? The keys would be  
>> the column
>> values. There'd be a single matchingRows column family, and the  
>> qualifiers
>> and values would be the rows that match that column. Then, when  
>> you want to
>> scan in column order instead of row order, you scan the index  
>> table, find
>> the list of rows that match each column, and then do a random read  
>> to grab
>> those individually. It'll for sure be slower than scanning a table  
>> ordered
>> by rows, but it'll get you what you want. It'll also handle the  
>> case where
>> the column values aren't unique.
>>
>>  If you need custom sorting for those values, then HBASE-82 would  
>> solve that
>> problem.
>>
>>  -Bryan
>>
>>
>>
>>  On Apr 22, 2008, at 11:58 AM, stack wrote:
>>
>>
>>> Some questions interlaced below:
>>>
>>> Clint Morgan wrote:
>>>
>>>> All,
>>>>
>>>> We want to put secondary indexes into hbase. The motivation is  
>>>> that we
>>>> are storing data in hbase that we want to serve to users. We would
>>>> like to be able to serve rows sorted by column values. Our queries
>>>> will be over rows with a given key prefix, so we should not be  
>>>> hitting
>>>> to many regions.
>>>>  I was thinking it would work roughly like this:
>>>>
>>>> - At table creation time, individual columns can be declared as
>>>> indexed. By default we could sort the column values  
>>>> lexicographically,
>>>> or we can provide a WritableComparatorFactory<T> which has the  
>>>> ability
>>>> to make values of type T from a byte [], as well as providing a
>>>> Comparator<T>. (Better than providing a Comparator<byte[]> as
it  
>>>> only
>>>> costs once per row insert for deserialization, rather that twice on
>>>> each comparison).
>>>>
>>>>
>>>
>>> I don't follow what the Factory adds.
>>>
>>> We're talking about getting HBASE-82 into 0.2.  Does that  
>>> interfere with
>> this proposal?  (I'm thinking that along w/ rows becoming byte  
>> arrays rather
>> Text with a client-supplied Comparator, column qualifiers would  
>> shift to be
>> byte arrays also -- though yeah, implies that if your sort is not
>> byte-lexicographical, yes, the compares can be costly involving two
>> deserializations per compare).
>>>
>>>> - We catch all writes/deletes and maintain a SortedMap<T,  
>>>> HStoreKey>
>>>> which keeps the column values in order, and maps them back to row
>>>> keys. First cut may just keep all this in memory, but it should be
>>>> backed with MapFile(s).
>>>>
>>>>
>>>
>>> Would be sweet if you could leverage the HBase memcache code and  
>>> flusher
>> to do the above.
>>>
>>> This Map would be global for the table?  Or per Region?
>>>
>>> A lucene index wouldn't work for you because you want ordering?
>>>
>>>
>>>> - Add to the hregion the ability to scan through keys in column  
>>>> order.
>>>> Just iterate through the SortedMap, run a filter on the key, and  
>>>> if it
>>>> passes do a get on the row.
>>>>
>>>>
>>>
>>> You'd be random reading rows.  You're OK w/ current performance?   
>>> (For
>> sure it will only improve but....).
>>>
>>>
>>>> - Add a ColumnOrderedClientScanner which will open column order
>>>> scanners to all applicable hregions, and continuously pick row with
>>>> the lowest column value from each of the client scanners.
>>>>
>>>>
>>>
>>> This scanner would have a significant client-side component to do  
>>> the
>> arbitrage between all regions to figure the lowest column value?   
>> If you had
>> a new type of 'region' -- one denoted by lowest and upper column  
>> then the
>> client-side logic would fade away and your scanner would look like  
>> current
>> scanners.
>>>
>>>> - Region splits should be easy enough, just a scan through the
>>>> SortedMap to partition.
>>>>
>>>>
>>>
>>> Splits would not be row-based and run as they currently do, but  
>>> rather
>> sorted-column based?
>>>
>>>
>>>> Of course, the index could also be used for more efficient  
>>>> querying on
>>>> the indexed column's values.
>>>>
>>>> Do other users have a need for this functionality?
>>>>
>>>> What do developers think about this? I know hbase is more  
>>>> intended for
>>>> back-end batch style processing, but we have this need.
>>>>
>>>>
>>>>
>>> How are you thinking of adding in this new functionality?   
>>> Subclassing
>> HRegionServer?
>>>
>>> St.Ack
>>>
>>>
>>>> Cheers,
>>>> -clint
>>>>
>>>>
>>>
>>>
>>
>>


Mime
View raw message