cassandra-user mailing list archives

From Anuj Wadehra <anujw_2...@yahoo.co.in>
Subject Re: Drawbacks of Major Compaction now that Automatic Tombstone Compaction Exists
Date Mon, 13 Apr 2015 17:52:03 GMT
Any comments on the side effects of major compaction, especially when the generated sstable
is 100+ GB?


Since Cassandra 1.2, automatic tombstone compaction runs even on a single sstable if its tombstone
percentage exceeds the tombstone_threshold sub-property specified in the compaction strategy.
So even if the huge sstable is not compacted with any new sstable, its tombstones will still be
collected. Is there any other disadvantage of having a giant sstable of hundreds of GB? I understand
that sstables have a summary and an index, which help locate the correct data blocks directly in
a large data file. Are there still any disadvantages?
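
A rough sketch of the single-sstable tombstone check described above, in Python. This is a simplified model, not Cassandra's code: the SSTableStats class and the function name are hypothetical, and the age check against tombstone_compaction_interval is an added assumption that the message does not mention. The defaults shown (0.2 and 86400 seconds) are the commonly documented ones.

```python
from dataclasses import dataclass


@dataclass
class SSTableStats:
    """Hypothetical summary of one sstable, for illustration only."""
    droppable_tombstone_ratio: float  # estimated fraction of droppable tombstones
    age_seconds: int                  # time since the sstable was written


def eligible_for_tombstone_compaction(
    stats: SSTableStats,
    tombstone_threshold: float = 0.2,
    tombstone_compaction_interval: int = 86400,
) -> bool:
    """Return True if this sstable alone would qualify for a tombstone compaction.

    Simplified model: the sstable must be older than the interval and its
    estimated droppable-tombstone ratio must exceed the threshold.
    """
    if stats.age_seconds < tombstone_compaction_interval:
        return False
    return stats.droppable_tombstone_ratio > tombstone_threshold


# A two-day-old sstable with 35% droppable tombstones qualifies on its own.
old_and_dirty = SSTableStats(droppable_tombstone_ratio=0.35, age_seconds=2 * 86400)
# The same ratio on a one-hour-old sstable does not qualify yet.
too_young = SSTableStats(droppable_tombstone_ratio=0.35, age_seconds=3600)
```

Under this model the size of the sstable plays no role in the check, which is why a single huge sstable can still have its tombstones collected.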


Thanks

Anuj Wadehra



From:"Anuj Wadehra" <anujw_2003@yahoo.co.in>
Date:Mon, 13 Apr, 2015 at 12:33 am
Subject:Re: Drawbacks of Major Compaction now that Automatic Tombstone Compaction Exists

No.


Anuj Wadehra




On Monday, 13 April 2015 12:23 AM, Sebastian Estevez <sebastian.estevez@datastax.com>
wrote:



Have you tried user defined compactions via JMX?

On Apr 12, 2015 1:40 PM, "Anuj Wadehra" <anujw_2003@yahoo.co.in> wrote:

Recently we faced an issue where every repair operation added hundreds of sstables
(CASSANDRA-9146). To bring the situation under control and make sure reads were not impacted,
we were left with no option but to run a major compaction so that the thousands of tiny sstables
were compacted.

Queries:
Does major compaction have any drawbacks now that automatic tombstone compaction has been implemented
in 1.2 via the tombstone_threshold sub-property (CASSANDRA-3442)?
I understand that the huge sstable created by a major compaction won't be compacted with new
data any time soon, but is that a problem if purgeable data is removed via automatic tombstone
compaction? If a major compaction results in a huge file, say 500 GB, what are the drawbacks
of that?

If one big sstable is a problem, is there any way of solving it? We tried running
sstablesplit after the major compaction to split the big sstable, but because the new sstables were
all the same size, they were compacted into a single huge sstable again once Cassandra was started
after sstablesplit finished.
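
The re-merge after sstablesplit is what size-tiered bucketing would be expected to do with equally sized files. The sketch below is a simplified Python model of that grouping, not Cassandra's implementation: the function names are hypothetical, the sizes are made-up examples, and the parameter defaults (bucket_low 0.5, bucket_high 1.5, min_sstable_size 50 MB, min_threshold 4) are the commonly documented size-tiered defaults, assumed here rather than taken from this thread.

```python
def size_tiered_buckets(sizes_mb, bucket_low=0.5, bucket_high=1.5,
                        min_sstable_size_mb=50):
    """Group sstable sizes into buckets of similar size (simplified model)."""
    buckets = []  # each bucket: {"avg": float, "members": [sizes]}
    for size in sorted(sizes_mb):
        for bucket in buckets:
            avg = bucket["avg"]
            similar = avg * bucket_low <= size <= avg * bucket_high
            both_small = size < min_sstable_size_mb and avg < min_sstable_size_mb
            if similar or both_small:
                bucket["members"].append(size)
                bucket["avg"] = sum(bucket["members"]) / len(bucket["members"])
                break
        else:
            # No existing bucket is close enough in size: start a new one.
            buckets.append({"avg": float(size), "members": [size]})
    return [bucket["members"] for bucket in buckets]


def compaction_candidates(sizes_mb, min_threshold=4):
    """Return the buckets that hold enough sstables to be compacted together."""
    return [b for b in size_tiered_buckets(sizes_mb) if len(b) >= min_threshold]


# Ten equal 50 GB pieces, as a split might produce: one bucket holding all ten,
# so they are candidates to be merged straight back together.
equal_pieces = [51200] * 10
# Pieces that differ by 4x each land in separate buckets: nothing to compact.
tiered_pieces = [1000, 4000, 16000, 64000]
```

In this model compaction_candidates(equal_pieces) returns a single bucket containing all ten files, while compaction_candidates(tiered_pieces) returns an empty list, which matches the behaviour described above.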



Thanks

Anuj Wadehra



