hbase-user mailing list archives

From Sagar Naik <sn...@splunk.com>
Subject Re: HBase Design : Column name v/s Version
Date Fri, 24 Jan 2014 19:36:16 GMT
I do not have to purge the data.
I always need all the versions.

But Dhaval raised a valid point about 100K versions and the lack of
pagination support based on versions.
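
To make the pagination concern concrete, here is a toy in-memory sketch.
It is not the HBase client API: ToyRow, get_versions and page_qualifiers
are hypothetical names, and the model only assumes that a cell is
addressed by (qualifier, version) and that qualifiers sort
lexicographically (hence the zero-padded suffixes). Under those
assumptions, qualifiers (schema 1) can be walked a page at a time, while
a plain get of all versions (schema 2) returns everything in one result.

```python
class ToyRow:
    """Toy stand-in for one HBase row in one column family (not the HBase API)."""

    def __init__(self):
        # Cell coordinates within the row: (qualifier, version) -> value.
        self.cells = {}

    def put(self, qualifier, version, value):
        # Writing to existing coordinates replaces the stored value.
        self.cells[(qualifier, version)] = value

    def get_versions(self, qualifier):
        """Schema 2 style: all versions of one qualifier, newest first, in one result."""
        hits = [(v, val) for (q, v), val in self.cells.items() if q == qualifier]
        return sorted(hits, reverse=True)

    def page_qualifiers(self, prefix, limit, after=None):
        """Schema 1 style: up to `limit` matching qualifiers, resuming after `after`."""
        quals = sorted({q for (q, _v) in self.cells
                        if q.startswith(prefix) and (after is None or q > after)})
        # Return the newest value stored under each qualifier on this page.
        return [(q, self.get_versions(q)[0][1]) for q in quals[:limit]]


N = 25  # small stand-in for the 10 to 100K range discussed in the thread
schema1, schema2 = ToyRow(), ToyRow()
for i in range(1, N + 1):
    # Zero-pad so that lexicographic order matches numeric order.
    schema1.put("data_%06d" % i, 0, "value_%d" % i)
    schema2.put("data", i, "value_%d" % i)

# Schema 2: one call, one result holding every version.
all_versions = schema2.get_versions("data")

# Schema 1: the same data fetched ten qualifiers at a time.
first = schema1.page_qualifiers("data_", limit=10)
second = schema1.page_qualifiers("data_", limit=10, after=first[-1][0])
```

In this model the page boundary is simply the last qualifier seen, which
is the kind of resume point that a version-only layout does not offer as
directly.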

-Sagar

On 1/24/14 11:23 AM, "Vladimir Rodionov" <vrodionov@carrieriq.com> wrote:

>One downside of using synthetic versions is that you won't be able to use
>TTL, which gives you automatic purging of stale data for free.
>Have you already thought about how to purge old data?
>
>Best regards,
>Vladimir Rodionov
>Principal Platform Engineer
>Carrier IQ, www.carrieriq.com
>e-mail: vrodionov@carrieriq.com
>
>________________________________________
>From: Sagar Naik [snaik@splunk.com]
>Sent: Friday, January 24, 2014 10:46 AM
>To: user@hbase.apache.org; Dhaval Shah
>Subject: Re: HBase Design : Column name v/s Version
>
>Thanks for clarifying,
>
>I will be using custom version numbers (auto-incremented on the client
>side), not timestamps.
>No two clients update the same row.
>
>
>-Sagar
>
>On 1/24/14 10:33 AM, "Dhaval Shah" <prince_mithibai@yahoo.co.in> wrote:
>
>>I am talking about schema 2. Schema 1 would definitely work. Schema 2 can
>>have version collisions if you decide to use timestamps as versions.
>>
>>Regards,
>>
>>Dhaval
>>
>>
>>----- Original Message -----
>>From: Sagar Naik <snaik@splunk.com>
>>To: "user@hbase.apache.org" <user@hbase.apache.org>; Dhaval Shah
>><prince_mithibai@yahoo.co.in>
>>Cc:
>>Sent: Friday, 24 January 2014 1:07 PM
>>Subject: Re: HBase Design : Column name v/s Version
>>
>>I am not sure I understand you correctly.
>>I assume you are talking about schema 1.
>>In this case I am appending the version number to the column name.
>>
>>The column_names are different (data_1/data_2) for value_1 and value_2
>>respectively.
>>
>>
>>-Sagar
>>
>>
>>On 1/24/14 9:47 AM, "Dhaval Shah" <prince_mithibai@yahoo.co.in> wrote:
>>
>>>Versions in HBase are timestamps by default. If you intend to continue
>>>using the timestamps, what will happen when someone writes value_1 and
>>>value_2 at the exact same time?
>>>
>>>Regards,
>>>
>>>Dhaval
>>>
>>>
>>>----- Original Message -----
>>>From: Sagar Naik <snaik@splunk.com>
>>>To: "user@hbase.apache.org" <user@hbase.apache.org>
>>>Cc:
>>>Sent: Friday, 24 January 2014 12:27 PM
>>>Subject: HBase Design : Column name v/s Version
>>>
>>>Hi,
>>>
>>>I have a choice to maintain the data either as column values or as
>>>versioned data.
>>>This data is not a versioned copy per se.
>>>
>>>The access pattern is to get all the data every time.
>>>
>>>So the schema choices are :
>>>Schema 1:
>>>1. column_name/qualifier => data_1. column_value => value_1
>>>1.a. column_name/qualifier => data_2. column_value => value_2,value_2.a
>>>1.b. column_name/qualifier => data_3. column_value => value_3
>>>
>>>To get all the values for "data", I will have to use ColumnPrefixFilter
>>>with the prefix set to "data".
>>>
>>>Schema 2:
>>>2. column_name/qualifier => data. version => 1, column_value => value_1
>>>2.a. column_name/qualifier => data. version => 2, column_value => value_2,value_2.a
>>>2.b. column_name/qualifier => data. version => 3, column_value => value_3
>>>
>>>To get all the values for "data", I will do a simple get operation to
>>>get all the versions.
>>>
>>>The number of versions can range from 10 to 100K.
>>>
>>>Get performance should beat filter performance.
>>>Comparing 100K values will be costly as the number of versions increases.
>>>
>>>I would like to know if there are drawbacks in going the version route.
>>>
>>>
>>>
>>>
>>>-Sagar
>>>
>>
>
>

