hbase-user mailing list archives

From Davey Yan <davey....@gmail.com>
Subject Re: How many data versions should I keep in HBase?
Date Wed, 11 Apr 2012 01:08:45 GMT
Thank you for your reply, Alex.

In my business case, it is unnecessary to store or access more than
one version of the data.
I will set MAX_VERSIONS => 1 for every table.
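
The arithmetic behind the question can be sketched as follows. This is a rough model, not HBase code: the function names and the 100 GB figure are made up for illustration, the 0.222 ratio is the SNAPPY figure from the original question, and the model assumes (as Alex's reply suggests is likely) that the compression ratio does not depend on how many versions are kept. It ignores key overhead, delete markers, and files that have not yet been compacted.

```python
def retained_versions(writes_per_cell, max_versions):
    """Versions of a cell that survive once excess versions are discarded.

    A cell never holds more versions than were actually written to it.
    """
    if writes_per_cell < 1 or max_versions < 1:
        raise ValueError("writes_per_cell and max_versions must be >= 1")
    return min(writes_per_cell, max_versions)


def estimated_size(raw_size_per_version, writes_per_cell, max_versions,
                   compression_ratio):
    """Rough on-disk size: versions kept x raw size of one version x ratio."""
    if not 0 < compression_ratio <= 1:
        raise ValueError("compression_ratio must be in (0, 1]")
    kept = retained_versions(writes_per_cell, max_versions)
    return raw_size_per_version * kept * compression_ratio


# Hypothetical workload: 100 GB of raw data per version, ratio 0.222.
RAW_GB = 100.0
RATIO = 0.222

# Every cell overwritten at least 3 times: 3 versions vs. 1 version kept.
heavy_3 = estimated_size(RAW_GB, writes_per_cell=5, max_versions=3,
                         compression_ratio=RATIO)
heavy_1 = estimated_size(RAW_GB, writes_per_cell=5, max_versions=1,
                         compression_ratio=RATIO)

# Every cell written exactly once: the version limit changes nothing.
once_3 = estimated_size(RAW_GB, writes_per_cell=1, max_versions=3,
                        compression_ratio=RATIO)
once_1 = estimated_size(RAW_GB, writes_per_cell=1, max_versions=1,
                        compression_ratio=RATIO)

print(heavy_3, heavy_1, once_3, once_1)
```

Under these assumptions, dropping from 3 versions to 1 cuts the stored size to a third only when each cell has actually been written three or more times. For cells that are written once, the two settings give the same size.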

On Tue, Apr 10, 2012 at 8:54 PM, Alex Baranau <alex.baranov.v@gmail.com> wrote:
> Compression applies to the files stored on disks. All versions of a column
> are stored the same way (HBase doesn't differentiate them at the time of
> writing and they are not placed "near" each other in the file). Given that,
> yes you are likely to get the same level of compression (compr. ratio) if
> you increase the # of versions to store.
>
> May I ask what your business case is that requires storing multiple
> versions, when at the same time you are never going to access them?
>
> Alex
> ------
> Sematext :: http://blog.sematext.com/ :: Solr - Lucene - Hadoop - HBase
>
> On Tue, Apr 10, 2012 at 2:58 AM, Davey Yan <davey.yan@gmail.com> wrote:
>
>> Hi,
>>
>> In my business case, it is unnecessary to keep more than one version of
>> the data.
>> The application code will never try to get/scan older versions.
>>
>> Should I set the MAX_VERSIONS => 1 for every table, instead of the default
>> 3 ?
>>
>> The HBase book online says: Compression will boost performance by
>> reducing the size of StoreFiles and thus reducing I/O.
>> (http://hbase.apache.org/book/important_configurations.html)
>> I have enabled SNAPPY compression, which ideally reduces the data to
>> 22.2% of its original size.
>> So if I set MAX_VERSIONS => 1, will the data shrink to 1/3 of that
>> again?
>>
>> Thanks for your time.
>> Sincerely,
>>
>>
>> --
>> Davey Yan
>>



-- 
Davey Yan
