hbase-user mailing list archives

From Josh Elser <els...@apache.org>
Subject Re: questions regarding hbase major compaction
Date Tue, 11 Sep 2018 01:35:26 GMT
1. Yes
2. HDFS NameNode pressure, read slowdowns, generally poor performance
3. The default configuration is weekly. Unless you know of a specific 
reason why weekly doesn't work, this is what you should follow ;)
4. No
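
For reference on point 3, here is a minimal sketch of how the weekly interval would look if written out explicitly in hbase-site.xml. This assumes the stock property name hbase.hregion.majorcompaction with its value in milliseconds; check the defaults shipped with your HBase version before relying on it, since you normally do not need to set this at all.

```xml
<!-- hbase-site.xml fragment (illustrative; restates the default, assumed to be 7 days) -->
<property>
  <name>hbase.hregion.majorcompaction</name>
  <!-- 7 days in milliseconds: 7 * 24 * 60 * 60 * 1000 -->
  <value>604800000</value>
</property>
```

Setting the value to 0 should disable time-based major compactions, in which case you would have to trigger them yourself on whatever schedule suits the cluster.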

I would be surprised if you need to do anything special with S3, but I 
don't know for sure.

On 9/10/18 2:19 PM, Antonio Si wrote:
> Hello,
> 
> As I understand, the deleted records in hbase files do not get removed
> until a major compaction is performed.
> 
> I have a few questions regarding major compaction:
> 
> 1.   If I set a TTL and/or a max number of versions, the records that are
>        older than the TTL or the expired versions will still be in the hbase
>        files until a major compaction is performed.
>        Is my understanding correct?
> 
> 2.   If a major compaction is never performed on a table, besides the size
>        of the table continuing to increase, eventually we will have too many
>        hbase files and the cluster will slow down.
>        Are there any other implications?
> 
> 3.   Are there any guidelines about how often we should run major compaction?
> 
> 4.   During major compaction, do we need to pause all read/write operations
>        until major compaction is finished?
> 
>        I realize that when using S3 as the storage, after I run major
>        compaction, there are inconsistencies between the S3 metadata and the
>        S3 file system, and I need to run an "emrfs sync" to synchronize them
>        after major compaction is completed. Does it mean I need to pause all
>        read/write operations during this period?
> 
> Thanks.
> 
> Antonio.
> 
