Date: Tue, 27 Jul 2010 19:26:48 +0200
Subject: Re: Cassandra disk space utilization WAY higher than I would expect
From: Peter Schuller
To: user@cassandra.apache.org

> Minor compactions (see
> http://wiki.apache.org/cassandra/MemtableSSTable) will try to keep the
> growth in check but it is by no means limited to 2x.

Sorry I was being unclear. I was rather thinking along the lines of a
doubling of data triggering an implicit major compaction. However I
was wrong anyway, since minimumCompactionThreshold in
CompactionManager is set to 4.

This does make me realize that the actual worst-case spike of disk
space usage is decidedly non-trivial to figure out, even if we are
allowed to assume that compaction speed is equal to or greater than
the speed of writes.

--
/ Peter Schuller
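
As a rough illustration of why the spike is hard to pin down, here is a toy
model of minor compaction with a threshold of 4. It is only a sketch under
loud assumptions, not Cassandra's CompactionManager: every flush produces an
SSTable of the same size, tables are bucketed by exact size rather than by
similar size, nothing is overwritten or deleted (so a merged table is as large
as its inputs), and no writes arrive while a compaction runs. The function
name simulate and its parameters are hypothetical.

```python
from collections import Counter


def simulate(flushes, threshold=4, flush_size=1):
    """Toy model of minor compaction. Returns (final_usage, peak_usage).

    Assumptions (not taken from Cassandra's code): equal-sized flushes,
    exact-size buckets, no overwrites or tombstones, and no writes
    arriving during a compaction.
    """
    tables = Counter()  # sstable size -> number of such tables on disk
    usage = 0           # bytes (in flush units) held by finished tables
    peak = 0            # highest usage seen, including in-flight merges

    for _ in range(flushes):
        # A memtable flush adds one small SSTable.
        tables[flush_size] += 1
        usage += flush_size
        peak = max(peak, usage)

        # A merge can fill the next bucket, so keep cascading upward.
        size = flush_size
        while tables[size] >= threshold:
            merged = size * threshold
            # Inputs and output coexist on disk until the merge finishes.
            peak = max(peak, usage + merged)
            tables[size] -= threshold
            tables[merged] += 1
            size = merged
        # With no overwrites, usage is unchanged once the inputs are removed.

    return usage, peak


if __name__ == "__main__":
    for n in (4, 5, 16):
        print(n, simulate(n))
```

Even in this simplified model the peak depends on where in the cascade the
node happens to be: after 16 flushes a single cascade rewrites everything and
the transient footprint is twice the stored data, while after 5 flushes the
peak was set by an earlier merge. Overwrites, tombstones, and writes arriving
during a compaction, all ignored here, would change the picture further.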