cassandra-commits mailing list archives

From "Branimir Lambov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-7019) Improve tombstone compactions
Date Thu, 18 Feb 2016 10:06:18 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15152067#comment-15152067 ]

Branimir Lambov commented on CASSANDRA-7019:
--------------------------------------------

Uploaded a new version here:
|[code|https://github.com/blambov/cassandra/tree/7019-tryouts-no-deserialization]|[utest|http://cassci.datastax.com/job/blambov-7019-tryouts-no-deserialization-testall/]|[dtest|http://cassci.datastax.com/job/blambov-7019-tryouts-no-deserialization-dtest/]|

Changes:
- The option is now an enum with three values, NONE, ROW and CELL, controlling what level of deletions and overwrites to examine.
- The ROW option works as in the previous version, but now uses an implementation of the simple sstable iterator that only decodes tombstones and row deletions, skipping over row content (for 3.0+ tables only). It also skips sstables that do not have tombstones.
- The CELL option also examines row content to find overwritten or deleted cells.
- The partition-level deletion is now handled properly (partially undoing your change): it is _removed_ if superseded by the partition-level deletion from the tombstone source. The latter is also used to filter the partition content.
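
To make the list above concrete, here is a minimal sketch of how the three levels and the partition-deletion rule could be modelled. It is not the code from the linked branch: the class name, method names and the timestamp-only comparison are assumptions made for illustration, and only the values NONE, ROW and CELL come from the comment itself. Reading every overlapping sstable in CELL mode is also an assumption, based on the fact that overwrites can live in sstables that hold no tombstones.

```java
import java.util.Locale;

/** Illustrative model only; names other than NONE, ROW and CELL are hypothetical. */
final class OverlapOptionSketch {
    /** The three levels of the option described above. */
    enum Level { NONE, ROW, CELL }

    /** Parses an option value such as "CELL" from a compaction parameters map. */
    static Level parse(String value) {
        return Level.valueOf(value.trim().toUpperCase(Locale.ROOT));
    }

    /** Decides whether an overlapping sstable needs to be read as a tombstone source. */
    static boolean shouldReadSource(Level level, boolean sourceHasTombstones) {
        switch (level) {
            case NONE:
                return false;               // overlapping sstables are never consulted
            case ROW:
                return sourceHasTombstones; // sstables without tombstones are skipped
            case CELL:
                return true;                // assumption: overwrites need no tombstones
            default:
                throw new AssertionError(level);
        }
    }

    /**
     * A partition-level deletion is dropped from the output when the tombstone
     * source carries a newer one. Assumption: only the deletion timestamp is
     * compared, and "superseded" means strictly newer.
     */
    static boolean dropPartitionDeletion(long ownMarkedForDeleteAt, long sourceMarkedForDeleteAt) {
        return sourceMarkedForDeleteAt > ownMarkedForDeleteAt;
    }

    public static void main(String[] args) {
        Level level = parse("ROW");
        System.out.println(level + " reads tombstone-free source: " + shouldReadSource(level, false));
        System.out.println("drop deletion at 10 given source at 20: " + dropPartitionDeletion(10L, 20L));
    }
}
```

Keeping the level as an enum rather than a boolean leaves room for the compaction code to branch once on the level and pick the cheaper tombstone-only iterator for ROW.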

Minor additional changes:
- Data file references of the tombstone tables are now explicitly opened and closed, only once.
- Fixes a bug in the {{hashCode}} calculation for {{BTreeRow}}, which was always producing a different value.
- Fixes unnecessary sorting when finding a table for tombstone compaction.
- Adds more tests and fixes test failures.
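
The comment does not say what caused the {{hashCode}} bug, so the following is only a generic illustration of how a hash can end up different for equal content, not a reconstruction of the {{BTreeRow}} code. The class `RowHashSketch` and its fields are hypothetical. A common cause is hashing an array through its own `hashCode()`, which is identity-based, instead of hashing its contents.

```java
import java.util.Arrays;

/** Hypothetical row-like holder used only to illustrate the equals/hashCode contract. */
final class RowHashSketch {
    private final Object[] cells;

    RowHashSketch(Object... cells) {
        this.cells = cells.clone();
    }

    /** Unstable variant: an array's own hashCode() depends on identity, not content. */
    int identityBasedHash() {
        return cells.hashCode();
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof RowHashSketch
            && Arrays.equals(cells, ((RowHashSketch) other).cells);
    }

    /** Content-based hash, consistent with equals(). */
    @Override
    public int hashCode() {
        return Arrays.hashCode(cells);
    }

    public static void main(String[] args) {
        RowHashSketch a = new RowHashSketch("name", 42);
        RowHashSketch b = new RowHashSketch("name", 42);
        System.out.println("equal: " + a.equals(b));
        System.out.println("content hashes match: " + (a.hashCode() == b.hashCode()));
        System.out.println("identity hashes match: " + (a.identityBasedHash() == b.identityBasedHash()));
    }
}
```

A hash that differs between equal objects breaks lookups in hash-based collections, which is why such a bug tends to surface in tests that compare rows through sets or maps.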

Performance run results:
{code}
{"provide_overlapping_tombstones":"CELL","class":"org.apache.cassandra.db.compaction.LeveledCompactionStrategy"}
CELL compactions completed in 6.364s
Operations completed in 394.591s, out of which 52.562 for ongoing NONE background compactions
At start:            9 tables    922541625 bytes       876088 rows       423530 deleted rows        42867 tombstone markers
At end:              9 tables    853445991 bytes       810249 rows       407096 deleted rows        41779 tombstone markers

{"provide_overlapping_tombstones":"ROW","class":"org.apache.cassandra.db.compaction.LeveledCompactionStrategy"}
ROW compactions completed in 6.577s
Operations completed in 408.181s, out of which 54.373 for ongoing NONE background compactions
At start:            9 tables    922539568 bytes       876088 rows       423530 deleted rows        42867 tombstone markers
At end:              9 tables    853446320 bytes       810249 rows       407096 deleted rows        41779 tombstone markers

{"provide_overlapping_tombstones":"NONE","class":"org.apache.cassandra.db.compaction.LeveledCompactionStrategy"}
NONE compactions completed in 6.415s
Operations completed in 402.645s, out of which 53.084 for ongoing NONE background compactions
At start:            9 tables    922534683 bytes       876088 rows       423530 deleted rows        42867 tombstone markers
At end:              9 tables    922531607 bytes       876088 rows       423530 deleted rows        42867 tombstone markers

{"max_threshold":"32","min_threshold":"4","provide_overlapping_tombstones":"CELL","class":"org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy"}
CELL compactions completed in 10.119s
Operations completed in 527.998s, out of which 18.164 for ongoing NONE background compactions
At start:           12 tables   1627719240 bytes      1549035 rows       551694 deleted rows        68948 tombstone markers
At end:             12 tables    853460582 bytes       835123 rows       407096 deleted rows        51964 tombstone markers

{"max_threshold":"32","min_threshold":"4","provide_overlapping_tombstones":"ROW","class":"org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy"}
ROW compactions completed in 8.299s
Operations completed in 519.072s, out of which 18.572 for ongoing NONE background compactions
At start:           12 tables   1627702075 bytes      1549035 rows       551694 deleted rows        68948 tombstone markers
At end:             12 tables    879153760 bytes       835123 rows       407096 deleted rows        51964 tombstone markers

{"max_threshold":"32","min_threshold":"4","provide_overlapping_tombstones":"NONE","class":"org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy"}
NONE compactions completed in 9.465s
Operations completed in 509.603s, out of which 18.052 for ongoing NONE background compactions
At start:           12 tables   1627710033 bytes      1549035 rows       551694 deleted rows        68948 tombstone markers
At end:             12 tables   1627706918 bytes      1549035 rows       551694 deleted rows        68948 tombstone markers
{code}
For size-tiered, ROW does most of the work (ending at about 879 MB versus 853 MB for CELL, from a 1.63 GB start) in less time (8.3s versus 10.1s), but there are certain to be scenarios where CELL helps more. The run doesn't appear to be long enough to see the effects for leveled; I'll add validation and start a longer one this evening.
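
As a quick check on the figures above, the fraction of space reclaimed in each run can be computed from the "At start" and "At end" byte counts. The helper name `reclaimedFraction` is made up for this sketch; the byte counts are copied from the performance output.

```java
/** Arithmetic over the byte counts reported in the run results; the helper name is illustrative. */
final class ReclaimedSpaceSketch {
    /** Fraction of the starting size that the tombstone compactions removed. */
    static double reclaimedFraction(long bytesAtStart, long bytesAtEnd) {
        return (double) (bytesAtStart - bytesAtEnd) / bytesAtStart;
    }

    public static void main(String[] args) {
        // Size-tiered runs
        double stcsCell = reclaimedFraction(1627719240L, 853460582L);
        double stcsRow = reclaimedFraction(1627702075L, 879153760L);
        // Leveled runs
        double lcsCell = reclaimedFraction(922541625L, 853445991L);
        double lcsRow = reclaimedFraction(922539568L, 853446320L);

        System.out.printf("STCS CELL: %.1f%%%n", stcsCell * 100);
        System.out.printf("STCS ROW:  %.1f%%%n", stcsRow * 100);
        System.out.printf("LCS CELL:  %.1f%%%n", lcsCell * 100);
        System.out.printf("LCS ROW:   %.1f%%%n", lcsRow * 100);
        // Share of the CELL savings that ROW already achieves for size-tiered
        System.out.printf("ROW share of CELL savings (STCS): %.1f%%%n", stcsRow / stcsCell * 100);
    }
}
```

For size-tiered this works out to roughly 47.6% reclaimed with CELL and 46.0% with ROW, while both leveled runs reclaim about 7.5%.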

Some of your points still remain:
- I haven't been able to do a cstar_perf test yet. Working on it.
- Single-table compactions still don't have this turned on by default -- need to test and choose CELL/ROW, and also figure out whether scrub/upgrade/cleanup etc. should be doing it.


> Improve tombstone compactions
> -----------------------------
>
>                 Key: CASSANDRA-7019
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7019
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: Marcus Eriksson
>            Assignee: Branimir Lambov
>              Labels: compaction
>             Fix For: 3.x
>
>
> When there are no other compactions to do, we trigger a single-sstable compaction if there is more than X% droppable tombstones in the sstable.
> In this ticket we should try to include overlapping sstables in those compactions to be able to actually drop the tombstones. Might only be doable with LCS (with STCS we would probably end up including all sstables)
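
The idea in the ticket description can be sketched as two steps: check whether an sstable passes the droppable-tombstone threshold, then collect the sstables whose token ranges overlap it. Everything in this sketch is hypothetical (the `SSTable` holder, its fields, the threshold passed as a parameter); it only illustrates the selection logic, not Cassandra's actual classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the selection described in the ticket; not Cassandra's API. */
final class TombstoneCompactionSketch {
    /** Minimal stand-in for an sstable: a token range plus an estimated droppable ratio. */
    static final class SSTable {
        final String name;
        final long firstToken;
        final long lastToken;
        final double droppableTombstoneRatio;

        SSTable(String name, long firstToken, long lastToken, double droppableTombstoneRatio) {
            this.name = name;
            this.firstToken = firstToken;
            this.lastToken = lastToken;
            this.droppableTombstoneRatio = droppableTombstoneRatio;
        }
    }

    /** True when more than the given fraction of the sstable is droppable tombstones. */
    static boolean worthCompacting(SSTable sstable, double threshold) {
        return sstable.droppableTombstoneRatio > threshold;
    }

    /** Returns the other sstables whose closed token range intersects the candidate's. */
    static List<SSTable> overlapping(SSTable candidate, List<SSTable> all) {
        List<SSTable> result = new ArrayList<>();
        for (SSTable other : all) {
            boolean intersects = other.firstToken <= candidate.lastToken
                              && candidate.firstToken <= other.lastToken;
            if (other != candidate && intersects) {
                result.add(other);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        SSTable a = new SSTable("a", 0, 100, 0.35);
        SSTable b = new SSTable("b", 50, 150, 0.01);
        SSTable c = new SSTable("c", 200, 300, 0.02);
        List<SSTable> all = List.of(a, b, c);
        if (worthCompacting(a, 0.2)) {
            for (SSTable s : overlapping(a, all)) {
                System.out.println("include " + s.name);
            }
        }
    }
}
```

With size-tiered compaction most sstables span the whole token range, so `overlapping` would tend to return nearly everything, which matches the concern raised in the description.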



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
