hbase-user mailing list archives

From Alex Baranau <alex.barano...@gmail.com>
Subject Re: How many data versions should I keep in HBase?
Date Tue, 10 Apr 2012 12:54:25 GMT
Compression applies to the files stored on disk. All versions of a column
are stored the same way (HBase doesn't differentiate between them at write
time, and they are not placed "near" each other in the file). Given that,
yes, you are likely to get the same compression ratio if you increase the
number of versions stored.
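
To make the arithmetic concrete, here is a rough sketch (not a measurement)
of how the version count and the compression ratio combine. The function
name and the 100 GiB figure are hypothetical; the 22.2% ratio is the one
quoted in the question. It assumes every cell has really been written at
least as many times as the version limit, and that the ratio does not
depend on how many versions are kept.

```python
def estimated_store_size(raw_bytes_per_version, versions_kept, compression_ratio):
    """Rough on-disk size, assuming each kept version compresses at the same ratio."""
    return raw_bytes_per_version * versions_kept * compression_ratio


# Hypothetical input: 100 GiB of uncompressed data per version.
raw = 100 * 1024**3
# "22.2% remaining" figure taken from the question.
ratio = 0.222

size_three_versions = estimated_store_size(raw, 3, ratio)
size_one_version = estimated_store_size(raw, 1, ratio)

# Fraction of space left after going from 3 kept versions to 1.
print(size_one_version / size_three_versions)
```

If most cells are written only once, the extra versions never exist and
lowering the limit saves little. Space held by surplus versions is also
only reclaimed once the store files are compacted.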

May I ask what your business case is that requires storing multiple
versions, even though you are never going to access them?

Sematext :: http://blog.sematext.com/ :: Solr - Lucene - Hadoop - HBase

On Tue, Apr 10, 2012 at 2:58 AM, Davey Yan <davey.yan@gmail.com> wrote:

> Hi,
> In my business case, it is unnecessary to keep more than one version of
> the data.
> The application code will never try to get/scan older versions.
> Should I set MAX_VERSIONS => 1 for every table, instead of the default
> of 3?
> The HBase book online says: Compression will boost performance by
> reducing the size of StoreFiles and thus reducing I/O.
> (http://hbase.apache.org/book/important_configurations.html)
> I have enabled SNAPPY compression; ideally it will reduce the data to
> 22.2% of its original size.
> So if I set MAX_VERSIONS => 1, will I reduce the data to 1/3 of that
> again?
> Thanks for your time.
> Sincerely,
> --
> Davey Yan
