hbase-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: hbase key design to efficient query on base of 2 or more column
Date Mon, 19 May 2014 09:41:17 GMT
The point is that a field with a small, finite set of values is not a good candidate
for indexing with an inverted table, b-tree, etc.

I’d say that you’re actually going to be better off using a scan with a start and stop
row, then doing the counts on the client side. 

As the result set comes back, you process the data, either in an M/R job or in a single
client thread.
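
A minimal sketch of that idea, under stated assumptions: the row key layout `yyyy-mm-dd#userid`, the sample rows, and the helper names `scan` and `count_users` are all hypothetical, and the in-memory sorted list only stands in for an HBase table. A real client would issue a scan with a start row and an exclusive stop row and run the same counting loop over the results.

```python
from bisect import bisect_left
from collections import Counter

# Hypothetical sample data: (row key, category), kept sorted by row key
# the way HBase stores rows. The "yyyy-mm-dd#userid" key layout is an assumption.
ROWS = sorted([
    ("2013-12-30#u001", "Normal"),
    ("2014-01-01#u002", "Supreme"),
    ("2014-01-15#u003", "Normal"),
    ("2014-02-03#u004", "Supreme"),
    ("2014-02-20#u005", "Medium"),
    ("2014-05-16#u006", "Supreme"),
    ("2014-05-17#u007", "Medium"),
])

def scan(rows, start_row, stop_row):
    """Yield rows with start_row <= key < stop_row (stop row is exclusive)."""
    keys = [key for key, _ in rows]
    lo = bisect_left(keys, start_row)
    hi = bisect_left(keys, stop_row)
    yield from rows[lo:hi]

def count_users(rows, start_date, end_date):
    """Count users per category, and per (month, category), on the client side."""
    by_category = Counter()
    by_month = Counter()
    # "~" sorts after "#", so end_date + "~" includes every key on end_date.
    for key, category in scan(rows, start_date, end_date + "~"):
        by_category[category] += 1
        by_month[(key[:7], category)] += 1  # key[:7] is "yyyy-mm"
    return by_category, by_month

by_category, by_month = count_users(ROWS, "2014-01-01", "2014-05-16")
```

One pass over the date range answers both questions: `by_category` gives the category-wise distribution, and filtering `by_month` on "Supreme" gives the month-wise breakdown for that category.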

HTH

On May 19, 2014, at 8:48 AM, Shushant Arora <shushantarora09@gmail.com> wrote:

> I cannot apply a server-side filter.
> The 2nd requirement is not just to get users in the Supreme category, but rather
> the distribution of users by category:
> 
> 1. How many Supreme, how many Normal, and how many Medium users there are to date.
> 
> 
> On Mon, May 19, 2014 at 12:58 PM, Michael Segel
> <michael_segel@hotmail.com> wrote:
> 
>> Whoa!
>> 
>> BAD BOY. This isn’t a good idea for a secondary index.
>> 
>> You have a row key (primary index) which is time.
>> The secondary is a filter… with 3 choices.
>> 
>> HINT: Do you really want a secondary index based on a field that only has
>> 3 choices for a value?
>> 
>> What are they teaching in school these days?
>> 
>> How about applying a server side filter?  ;-)
>> 
>> 
>> 
>> On May 18, 2014, at 12:33 PM, John Hancock <jhancock1975@gmail.com> wrote:
>> 
>>> Shushant,
>>> 
>>> Here's one idea; there might be better ways.
>>> 
>>> Take a look at Phoenix, which supports secondary indexing:
>>> http://phoenix.incubator.apache.org/secondary_indexing.html
>>> 
>>> -John
>>> 
>>> 
>>> On Sat, May 17, 2014 at 8:34 AM, Shushant Arora
>>> <shushantarora09@gmail.com> wrote:
>>> 
>>>> Hi
>>>> 
>>>> I have a requirement to query my data based on date and user category.
>>>> User category can be Supreme, Normal, or Medium.
>>>> 
>>>> I want to query how many new users there are in my table in the date range
>>>> (2014-01-01) to (2014-05-16), broken down by category.
>>>> 
>>>> Another requirement is to query how many users of the Supreme category
>>>> there are in my table, broken down by the month in which they came.
>>>> 
>>>> What should my key be?
>>>> 1. If I take the key as a combination of date#category, I cannot query based on
>>>> category.
>>>> 2. If I take the key as category#date, I cannot query based on date.
>>>> 
>>>> 
>>>> Thanks
>>>> Shushant.
>>>> 
>> 
>> 

