cassandra-commits mailing list archives

From "Carl Yeksigian (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-11179) Parallel cleanup can lead to disk space exhaustion
Date Wed, 23 Mar 2016 17:48:25 GMT


Carl Yeksigian commented on CASSANDRA-11179:

Looks good. Just a couple of comments:
- It would be nice to add a comment to {{parallelAllSSTableOperation}} explaining that
jobs = 0 means using all compactor threads, so that we remember to propagate that convention
to our argument.
- Also, it's not clear what would happen if you specified a jobs value higher than the number
of concurrent compactors. The expectation is probably that it would override that setting,
so either a warning or refusing to accept such a value would be helpful.
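
To make the two points above concrete, here is a minimal sketch of how the jobs argument could be resolved. It is an illustration only, not the code in the patch: the class {{CleanupJobs}}, the method {{effectiveJobs}}, and the choice to warn and clamp (rather than reject) an oversized value are all assumptions.

```java
// Hypothetical helper for illustration; not part of Cassandra's actual API.
final class CleanupJobs {
    private CleanupJobs() {}

    /**
     * Resolves the number of parallel jobs to run.
     *
     * @param requestedJobs        value supplied by the operator; 0 means "use all compactor threads"
     * @param concurrentCompactors number of configured compactor threads (must be positive)
     * @return the number of jobs that will actually run in parallel
     */
    static int effectiveJobs(int requestedJobs, int concurrentCompactors) {
        if (concurrentCompactors <= 0)
            throw new IllegalArgumentException("concurrentCompactors must be positive");
        if (requestedJobs < 0)
            throw new IllegalArgumentException("jobs must be >= 0");

        // jobs = 0 is the "use every compactor thread" convention.
        if (requestedJobs == 0)
            return concurrentCompactors;

        // Assumption: warn and clamp instead of silently exceeding the configured limit.
        if (requestedJobs > concurrentCompactors) {
            System.err.printf("jobs (%d) exceeds concurrent compactors (%d); using %d%n",
                              requestedJobs, concurrentCompactors, concurrentCompactors);
            return concurrentCompactors;
        }
        return requestedJobs;
    }
}
```

Rejecting the value outright with an exception would be the stricter alternative; either way the operator finds out that the requested parallelism was not honored.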

> Parallel cleanup can lead to disk space exhaustion
> --------------------------------------------------
>                 Key: CASSANDRA-11179
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction, Tools
>            Reporter: Tyler Hobbs
>            Assignee: Marcus Eriksson
>              Labels: doc-impacting
>             Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
> In CASSANDRA-5547, we made cleanup (among other things) run in parallel across multiple
> sstables.  There have been reports on IRC of this leading to disk space exhaustion, because
> multiple sstables are (almost entirely) rewritten at the same time.  This seems particularly
> problematic because cleanup is frequently run after a cluster is expanded due to low disk
> space.
> I'm not really familiar with how we perform free disk space checks now, but it sounds
> like we can make some improvements here.  It would be good to reduce the concurrency of cleanup
> operations if there isn't enough free disk space to support this.
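
The suggestion in the description, lowering cleanup concurrency when free disk space is short, could be sketched roughly as follows. This is a hedged illustration under stated assumptions, not how Cassandra performs its disk space checks: the class {{CleanupThrottle}} and the method {{maxParallelRewrites}} are hypothetical, and the sketch assumes the worst case in which each sstable being cleaned is rewritten at its full size and the largest sstables happen to run together.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of Cassandra's actual API.
final class CleanupThrottle {
    private CleanupThrottle() {}

    /**
     * Returns how many sstables can be rewritten concurrently without the
     * worst-case temporary space exceeding the free space.
     *
     * @param sstableSizes on-disk sizes of the sstables to clean, in bytes
     * @param freeBytes    free disk space available, in bytes
     * @param maxJobs      upper bound on parallelism (e.g. number of compactor threads)
     * @return a job count between 0 and maxJobs; 0 means not even the largest sstable fits
     */
    static int maxParallelRewrites(long[] sstableSizes, long freeBytes, int maxJobs) {
        long[] sorted = sstableSizes.clone();
        Arrays.sort(sorted); // ascending; walk from the end to visit the largest first

        long needed = 0;
        int jobs = 0;
        for (int i = sorted.length - 1; i >= 0 && jobs < maxJobs; i--) {
            // Compare against the remaining space to avoid overflow in the running sum.
            if (sorted[i] > freeBytes - needed)
                break;
            needed += sorted[i];
            jobs++;
        }
        return jobs;
    }
}
```

For sstables of 100, 50, and 30 bytes with 160 bytes free, this allows two concurrent rewrites: the 100 and 50 byte sstables fit together, but adding the 30 byte one would exceed the free space.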
