cassandra-commits mailing list archives

From "Marcus Eriksson (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-5745) Minor compaction tombstone-removal deadlock
Date Thu, 11 Jul 2013 09:43:49 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13705637#comment-13705637
] 

Marcus Eriksson edited comment on CASSANDRA-5745 at 7/11/13 9:43 AM:
---------------------------------------------------------------------

It would be nice to have something full-compaction-like for LCS; someone mentioned that Google's
LevelDB has a push-everything-to-the-highest-level lever.

I see two ways of doing this:

# Include all (possibly several thousand) sstables in a single compaction to the highest level.
This could be dangerous since we might OOM (see the comment around MAX_COMPACTING_L0 in LeveledManifest,
for example).
# Compact smaller chunks of sstables into the highest level. To avoid recompacting the highest
level many times, I guess we could exclude L0 and pick _one_ L1 sstable and compact it with all
overlapping higher-level sstables. I'm unsure how to handle L0, though; sstables in L0 should in
theory be few and short-lived, so it might not be a big problem (and the data in L0 should
almost always be newer, except after repair, so we could drop tombstones in L1+ anyway).
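The second option could look something like the sketch below. This is a minimal model with invented names (`Sstable`, `candidatesFor`), not Cassandra's actual LeveledManifest API: start from one L1 sstable, then walk up the levels collecting every sstable whose token range overlaps the candidate set, widening the range as we go, so that one compaction pushes that slice of data to the highest level.

```java
import java.util.*;

// Hypothetical sstable model for illustration: just a level and a token range.
class Sstable {
    final int level;
    final long first, last; // first/last token covered by this sstable

    Sstable(int level, long first, long last) {
        this.level = level; this.first = first; this.last = last;
    }

    boolean overlaps(long lo, long hi) {
        return first <= hi && last >= lo;
    }
}

public class MaximalCompactionPicker {
    // Pick one L1 sstable plus all overlapping sstables in levels 2..maxLevel.
    static List<Sstable> candidatesFor(Sstable l1sstable,
                                       Map<Integer, List<Sstable>> levels,
                                       int maxLevel) {
        List<Sstable> picked = new ArrayList<>(List.of(l1sstable));
        long lo = l1sstable.first, hi = l1sstable.last;
        for (int level = 2; level <= maxLevel; level++) {
            for (Sstable s : levels.getOrDefault(level, List.<Sstable>of())) {
                if (s.overlaps(lo, hi)) {
                    picked.add(s);
                    lo = Math.min(lo, s.first); // widen the range so deeper
                    hi = Math.max(hi, s.last);  // levels see the full span
                }
            }
        }
        return picked;
    }

    public static void main(String[] args) {
        Sstable l1 = new Sstable(1, 0, 100);
        Map<Integer, List<Sstable>> levels = Map.of(
            2, List.of(new Sstable(2, 50, 150), new Sstable(2, 200, 300)),
            3, List.of(new Sstable(3, 140, 180)));
        // l1 overlaps the first L2 sstable; the widened range [0,150] then
        // overlaps the L3 sstable, while the disjoint L2 sstable is skipped.
        System.out.println(candidatesFor(l1, levels, 3).size()); // 3
    }
}
```

Since sstables within a single level of LCS are non-overlapping, widening the range only pulls in extra sstables from deeper levels, which is exactly the data the compaction would need anyway to safely drop tombstones.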

Your first suggestion sounds interesting, though I guess that for every flushed sstable we
would essentially recompact all the data? The newly flushed data would quickly trickle up through
the empty levels and then get compacted with all the max-level sstables.
                
> Minor compaction tombstone-removal deadlock
> -------------------------------------------
>
>                 Key: CASSANDRA-5745
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5745
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Jonathan Ellis
>             Fix For: 2.0.1
>
>
> From a discussion with Axel Liljencrantz:
> If you have two SSTables with temporally overlapping data, you can get lodged into
a state where a compaction of SSTable A can't drop tombstones because SSTable B contains older
data, *and vice versa*. Once that happens, Cassandra is wedged into a state where
CASSANDRA-4671 no longer helps with tombstone removal. The only way to break the wedge is
to perform a compaction containing both SSTable A and SSTable B.
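The deadlock described above can be sketched with a toy model (names and the timestamp-based purge rule are simplifications for illustration, not Cassandra's actual code): a tombstone may only be purged if no sstable outside the compaction could hold older data that the tombstone shadows.

```java
import java.util.*;

// Hypothetical sstable model: just the min/max timestamps of its contents.
class SSTable {
    final String name;
    final long minTimestamp, maxTimestamp;

    SSTable(String name, long min, long max) {
        this.name = name;
        this.minTimestamp = min;
        this.maxTimestamp = max;
    }
}

public class TombstoneDeadlock {
    // A compaction over 'compacting' may drop a tombstone written at
    // 'tombstoneTs' only if no sstable outside the compaction holds data
    // older than the tombstone (data the tombstone might shadow).
    static boolean canDropTombstone(long tombstoneTs,
                                    Set<SSTable> compacting,
                                    List<SSTable> all) {
        for (SSTable s : all) {
            if (!compacting.contains(s) && s.minTimestamp < tombstoneTs)
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // A and B temporally overlap: each holds data older than the
        // other's newest tombstone.
        SSTable a = new SSTable("A", 10, 100); // tombstone at ts=100
        SSTable b = new SSTable("B", 20, 110); // tombstone at ts=110
        List<SSTable> all = List.of(a, b);

        // Compacting A alone: B's older data (ts=20 < 100) blocks the drop.
        System.out.println(canDropTombstone(100, Set.of(a), all)); // false
        // Compacting B alone: A's older data (ts=10 < 110) blocks it too.
        System.out.println(canDropTombstone(110, Set.of(b), all)); // false
        // Only a compaction including both A and B can purge either tombstone.
        System.out.println(canDropTombstone(100, Set.of(a, b), all)); // true
    }
}
```

The mutual blocking is why per-sstable tricks like CASSANDRA-4671 can't help here: no single-sstable compaction ever satisfies the purge condition, so only a compaction spanning both sstables breaks the cycle.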

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
