cassandra-commits mailing list archives

From "Marcus Eriksson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-11179) Parallel cleanup can lead to disk space exhaustion
Date Fri, 11 Mar 2016 15:30:03 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15191057#comment-15191057
] 

Marcus Eriksson commented on CASSANDRA-11179:
---------------------------------------------

I've been testing this a bit, and I don't think we have a problem with cleanup failing to
remove the old sstables during the operation.

I ran this monitor: https://github.com/krummas/cassandra-dtest/commits/monitor (I will convert
it to a proper dtest)
and got this output on 2.1:
{code}
/tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-5-Data.db
/tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-2-Data.db
/tmp/dtest-XWN_pU/test/node1/data1/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-1-Data.db
/tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-4-Data.db
/tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-3-Data.db
----------------
/tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-6-Data.db
/tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-5-Data.db
/tmp/dtest-XWN_pU/test/node1/data1/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-1-Data.db
/tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-4-Data.db
/tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-3-Data.db
/tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-tmp-ka-7-Data.db
----------------
/tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-6-Data.db
/tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-5-Data.db
/tmp/dtest-XWN_pU/test/node1/data1/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-1-Data.db
/tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-7-Data.db
/tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-tmp-ka-8-Data.db
/tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-3-Data.db
----------------
/tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-6-Data.db
/tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-5-Data.db
/tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-tmp-ka-9-Data.db
/tmp/dtest-XWN_pU/test/node1/data1/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-1-Data.db
/tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-8-Data.db
/tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-7-Data.db
----------------
/tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-6-Data.db
/tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-5-Data.db
/tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-tmp-ka-9-Data.db
/tmp/dtest-XWN_pU/test/node1/data1/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-1-Data.db
/tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-8-Data.db
/tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-7-Data.db
----------------
/tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-6-Data.db
/tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-9-Data.db
/tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-5-Data.db
/tmp/dtest-XWN_pU/test/node1/data1/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-tmp-ka-10-Data.db
/tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-8-Data.db
/tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-7-Data.db
----------------
/tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-6-Data.db
/tmp/dtest-XWN_pU/test/node1/data0/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-9-Data.db
/tmp/dtest-XWN_pU/test/node1/data1/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-10-Data.db
/tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-8-Data.db
/tmp/dtest-XWN_pU/test/node1/data2/keyspace1/standard1-ea6af260e79c11e5bc1783123b779c82/keyspace1-standard1-ka-7-Data.db
----------------
{code}
That is, with a single compactor we only write to one file at a time, and the old file is
gone once the -tmp- file disappears (i.e. the rewrite of that sstable has finished).
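
For illustration, listings like the ones above could be produced by a small polling loop.
This is only a sketch under my own assumptions, not the script from the linked branch: the
function names, the polling interval, and the choice to print only when the set of files
changes are all hypothetical.

```python
import os
import time


def list_data_files(data_dirs):
    """Return every *-Data.db path found under the given data directories."""
    found = []
    for data_dir in data_dirs:
        for dirpath, _dirnames, filenames in os.walk(data_dir):
            for name in filenames:
                if name.endswith("-Data.db"):
                    found.append(os.path.join(dirpath, name))
    return sorted(found)


def monitor(data_dirs, interval_seconds=1.0, iterations=10):
    """Print a listing each time the set of data files changes (hypothetical helper)."""
    previous = None
    for _ in range(iterations):
        current = list_data_files(data_dirs)
        if current != previous:
            print("\n".join(current))
            print("----------------")
            previous = current
        time.sleep(interval_seconds)
```

Pointing monitor() at the node's data0, data1 and data2 directories while cleanup runs would
show -tmp- files appearing and old sstables going away.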

On 3.0 I get this:
{code}
/tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-2-big-Data.db
/tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-5-big-Data.db
/tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-3-big-Data.db
/tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-4-big-Data.db
/tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-1-big-Data.db
----------------
/tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-2-big-Data.db
/tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-5-big-Data.db
/tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-3-big-Data.db
/tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-6-big-Data.db
/tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-4-big-Data.db
/tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-1-big-Data.db
----------------
/tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-8-big-Data.db
/tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-2-big-Data.db
/tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-3-big-Data.db
/tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-6-big-Data.db
/tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-7-big-Data.db
/tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-1-big-Data.db
----------------
/tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-8-big-Data.db
/tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-2-big-Data.db
/tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-9-big-Data.db
/tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-3-big-Data.db
/tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-6-big-Data.db
/tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-7-big-Data.db
----------------
/tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-8-big-Data.db
/tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-9-big-Data.db
/tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-3-big-Data.db
/tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-6-big-Data.db
/tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-10-big-Data.db
/tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-7-big-Data.db
----------------
/tmp/dtest-50KYOT/test/node1/data0/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-8-big-Data.db
/tmp/dtest-50KYOT/test/node1/data1/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-9-big-Data.db
/tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-6-big-Data.db
/tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-10-big-Data.db
/tmp/dtest-50KYOT/test/node1/data2/keyspace1/standard1-5dfb68a0e79c11e58938d7db499c29d8/ma-7-big-Data.db
----------------
{code}
The file count never goes above {{#original_files + 1}} with one compactor.
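
The file-count observation can be checked mechanically by splitting the monitor output on
the dashed separator lines and comparing each listing with the first one. A minimal sketch,
assuming the output format shown above; both function names are hypothetical.

```python
def parse_snapshots(text, separator="----------------"):
    """Split monitor output into a list of file listings, one per snapshot."""
    snapshots, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if line == separator:
            snapshots.append(current)
            current = []
        elif line:
            current.append(line)
    if current:
        # Keep a trailing listing that was not closed by a separator.
        snapshots.append(current)
    return snapshots


def exceeds_bound(snapshots):
    """True if any snapshot holds more than (files in the first snapshot + 1) files."""
    if not snapshots:
        return False
    limit = len(snapshots[0]) + 1
    return any(len(files) > limit for files in snapshots)
```

For the runs above, the first listing has 5 files and no later listing has more than 6, so
exceeds_bound() would return False.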

So this issue is probably down to the fact that people might run with, say, 8 concurrent
compactors, in which case up to 8 sstables are being rewritten at the same time and we
quickly use a lot more disk space.
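
A rough way to put numbers on this: if each in-flight cleanup rewrites its sstable almost
entirely and the old file is only removed once the rewrite finishes, the extra space needed
at any moment is bounded by the combined size of the sstables being rewritten. The sketch
below assumes the worst case where the largest sstables are picked together; the function
name and the sizes are made up for illustration.

```python
def worst_case_extra_bytes(sstable_sizes, concurrent_compactors):
    """Upper bound on extra disk space: the N largest sstables rewritten at once."""
    largest_first = sorted(sstable_sizes, reverse=True)
    return sum(largest_first[:concurrent_compactors])


# Illustrative only: eight sstables of 50 GB each.
sizes_gb = [50] * 8
one_compactor = worst_case_extra_bytes(sizes_gb, 1)     # 50
eight_compactors = worst_case_extra_bytes(sizes_gb, 8)  # 400
```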

> Parallel cleanup can lead to disk space exhaustion
> --------------------------------------------------
>
>                 Key: CASSANDRA-11179
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11179
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction, Tools
>            Reporter: Tyler Hobbs
>            Assignee: T Jake Luciani
>             Fix For: 3.0.x
>
>
> In CASSANDRA-5547, we made cleanup (among other things) run in parallel across multiple
> sstables.  There have been reports on IRC of this leading to disk space exhaustion, because
> multiple sstables are (almost entirely) rewritten at the same time.  This seems particularly
> problematic because cleanup is frequently run after a cluster is expanded due to low disk
> space.
> I'm not really familiar with how we perform free disk space checks now, but it sounds
> like we can make some improvements here.  It would be good to reduce the concurrency of cleanup
> operations if there isn't enough free disk space to support this.
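
The idea of reducing cleanup concurrency when free space is short could be sketched roughly
as follows. This is a hypothetical policy for illustration, not how Cassandra performs its
disk space checks: it conservatively assumes every rewrite needs as much space as its input
sstable and that the largest sstables run together.

```python
def max_safe_concurrency(sstable_sizes, free_bytes, configured_compactors):
    """Largest number of parallel rewrites whose inputs fit in the free space."""
    largest_first = sorted(sstable_sizes, reverse=True)
    needed = 0
    allowed = 0
    for size in largest_first[:configured_compactors]:
        if needed + size > free_bytes:
            break
        needed += size
        allowed += 1
    # 0 means even the largest sstable alone would not fit under this assumption.
    return allowed
```

A caller would then run cleanup with min(configured, allowed) threads, or refuse to start
when the result is 0.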



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
