hbase-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: issue about rowkey design
Date Mon, 19 Aug 2013 17:15:59 GMT
Multiple random seeks? 

Sorry, you've lost me. 

In a simple design, you use an inverted table where the indexed value is the row key and the
columns contain the base table's row keys.

One get() and you have all of the rows in the base table that match the key. 
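A minimal sketch of that inverted table, using plain Python dicts as stand-ins for the two HBase tables. This is not the HBase client API; the names `base_table`, `index_table`, `put_with_index`, and `lookup`, and the sample row keys, are all made up for illustration.

```python
from collections import defaultdict

# Plain dicts stand in for HBase tables; this is not the HBase client API.
base_table = {}                    # base row key -> {column: value}
index_table = defaultdict(dict)    # indexed value -> {base row key: b""}


def put_with_index(row_key, columns, indexed_column):
    """Write a base row and record its key under the indexed value."""
    base_table[row_key] = columns
    # The column qualifier carries the base row key; the cell value is unused.
    index_table[columns[indexed_column]][row_key] = b""


def lookup(indexed_value):
    """One get on the index row, then one get per matching base row."""
    index_row = index_table.get(indexed_value, {})
    return [base_table[key] for key in sorted(index_row)]


put_with_index("session-001", {"user_id": "u42", "visit_time": "20130816"}, "user_id")
put_with_index("session-002", {"user_id": "u42", "visit_time": "20130817"}, "user_id")
put_with_index("session-003", {"user_id": "u7", "visit_time": "20130817"}, "user_id")

matches = lookup("u42")  # the two rows written for user u42
```

The single read of the index row returns every matching base row key; the fetches against the base table are separate reads.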
The only gotcha is an index row that grows beyond the size of a region.
To get around this, you could write a function that periodically splits such a row into
several rows and keeps them in order, so that your keys are always in sort order. Then your get becomes
a scan with a known start row and stop row that finds all of the matching
rows in your base table.
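A rough sketch of the split-row variant, again with a sorted Python list standing in for HBase's ordered key space rather than the real client API. The key layout (`<indexed value>|<chunk number>`), the separator, and the function names are assumptions made for this example only.

```python
import bisect

# A sorted list of (row key, base keys) pairs stands in for HBase's ordered
# key space; this is not the HBase client API. Python compares str by code
# point, which matches byte order for the ASCII keys used here.
index_rows = []

SEP = "|"  # hypothetical separator between indexed value and chunk number


def put_index_row(row_key, base_keys):
    """Insert one chunk of an index row, keeping the list sorted by key."""
    bisect.insort(index_rows, (row_key, base_keys))


def scan(start_row, stop_row):
    """Return rows with start_row <= key < stop_row, like a start/stop scan."""
    keys = [key for key, _ in index_rows]
    lo = bisect.bisect_left(keys, start_row)
    hi = bisect.bisect_left(keys, stop_row)
    return index_rows[lo:hi]


def base_keys_for(indexed_value):
    """Collect base row keys from every chunk of one indexed value."""
    start_row = indexed_value + SEP
    # Bumping the separator by one gives the first key past all chunks.
    stop_row = indexed_value + chr(ord(SEP) + 1)
    found = []
    for _, base_keys in scan(start_row, stop_row):
        found.extend(base_keys)
    return found


put_index_row("u42|0000", ["session-001", "session-002"])
put_index_row("u42|0001", ["session-009"])
put_index_row("u420|0000", ["session-100"])
put_index_row("u7|0000", ["session-003"])

result = base_keys_for("u42")
```

Because the separator is part of the start row, the scan for `u42` skips the chunks of `u420` even though the two values share a prefix.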

This would be an efficient way to get rows based on a secondary index. However, you're really
going to want to be careful about how you use it.

On Aug 18, 2013, at 9:21 PM, Vladimir Rodionov <vrodionov@carrieriq.com> wrote:

> Secondary index requires multiple random seeks and is not efficient in many cases.
> What you need is different row_keys (one for each request type)
> user_id, session_id, visit_time =>
> rowkey1 => "q1", visit_time, user_id
> rowkey2 => "q2", visit_time, session_id
> rowkey3 => "q3", user_id, session_id : ts = visit_time
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodionov@carrieriq.com
> ________________________________________
> From: fgaule@despegar.com [fgaule@despegar.com]
> Sent: Sunday, August 18, 2013 6:25 PM
> To: user@hbase.apache.org; Kiru Pakkirisamy
> Subject: Re: issue about rowkey design
> You can use a secondary table as a 'secondary index' setting your row as value (or column) in it.
> Sent from my BlackBerry on Personal (http://www.personal.com.ar/)
> -----Original Message-----
> From: ch huang <justlooks@gmail.com>
> Date: Mon, 19 Aug 2013 09:05:19
> To: <user@hbase.apache.org>; Kiru Pakkirisamy<kirupakkirisamy@yahoo.com>
> Reply-To: user@hbase.apache.org
> Subject: Re: issue about rowkey design
> What do you mean by secondary index? Does HBase have secondary indexes?
> On Sat, Aug 17, 2013 at 12:48 AM, Kiru Pakkirisamy <
> kirupakkirisamy@yahoo.com> wrote:
>> We did design with something equivalent to userid as the key and all the
>> user sessions in there.
>> But when we tried to look for particular user sessions within a time
>> range, we found the ColumnPrefixFilter (say on the timerange) did not give
>> us much performance.
>> So we ended up creating another table with time-range as key and all the
>> user sessions ids in it (equivalent).
>> I am pretty much repeating Bryan, but if you just use the ids, you do not
>> duplicate that much data (called secondary index ?)
>> Regards,
>> - kiru
>> Kiru Pakkirisamy | webcloudtech.wordpress.com
>> ________________________________
>> From: Bryan Beaudreault <bbeaudreault@hubspot.com>
>> To: user@hbase.apache.org
>> Sent: Friday, August 16, 2013 8:06 AM
>> Subject: Re: issue about rowkey design
>> HBase is all about denormalization and designing for the usecase/query
>> pattern.   If it's possible for your application it will be better to
>> provide three different indexes, as opposed to fitting them all into one
>> rowkey design.
>> On Fri, Aug 16, 2013 at 5:33 AM, ch huang <justlooks@gmail.com> wrote:
>>> Hi all,
>>> I have a very large data set with user id, session id, and visit
>>> time. My query patterns are: find all user ids in a certain time range, find
>>> all session ids for one user, and find all session ids in a certain time range.
>>> My difficulty is that I cannot find a rowkey that is good for all of these
>>> search patterns. I wonder if I need three rowkeys for these search
>>> patterns, which would mean tripling my data storage. Any good ideas?
