Subject: Re: Removes increasing disk space usage in Cassandra?
Date: Thu, 3 Dec 2009 18:07:24 -0800
From: Ramzi Rabah
To: cassandra-user@incubator.apache.org

Looking at jconsole, I see a high number of writes when I do removes, so I am guessing these are tombstones being written? If that's the case, is the data being removed and replaced by tombstones, and will they all be deleted eventually when compaction runs?

On Thu, Dec 3, 2009 at 3:18 PM, Ramzi Rabah wrote:
> Hi all,
>
> I ran a test where I inserted about 1.2 gigabytes worth of data into
> each node of a 4-node cluster.
> I ran a script that first calls a get on each column inserted, followed
> by a remove. Since I was basically removing every entry
> I inserted before, I expected that the disk space occupied by the
> nodes would go down and eventually become 0. The disk space
> actually goes up when I do the bulk removes, to about 1.8 gigs per
> node. Am I missing something here?
>
> Thanks a lot for your help
> Ray
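[Editor's note: the behavior being asked about — a remove that *grows* disk usage because it appends a tombstone, which is only purged by a later compaction — can be sketched with a toy model. This is purely illustrative Python, not Cassandra's actual implementation; the class name, the dict-backed store, and the `GRACE_SECONDS` constant (a stand-in for Cassandra's GCGraceSeconds setting) are all assumptions for the sketch.]

```python
import time

GRACE_SECONDS = 10  # stand-in for Cassandra's GCGraceSeconds; value is illustrative


class ToyStore:
    """Toy log-structured store: a delete is itself a write (a tombstone),
    so space is reclaimed only when compaction later purges it."""

    def __init__(self):
        # key -> (value, write_timestamp, is_tombstone)
        self.rows = {}

    def put(self, key, value):
        self.rows[key] = (value, time.time(), False)

    def remove(self, key):
        # A remove does not free space: it records a tombstone marker,
        # which is why bulk removes can *increase* disk usage.
        self.rows[key] = (None, time.time(), True)

    def compact(self, now=None):
        # Compaction drops tombstones (and the data they shadow) once the
        # tombstone is older than the grace period.
        now = time.time() if now is None else now
        self.rows = {
            k: v for k, v in self.rows.items()
            if not (v[2] and now - v[1] > GRACE_SECONDS)
        }


store = ToyStore()
store.put("col1", "x" * 100)
store.remove("col1")              # tombstone written; entry still occupies space
assert "col1" in store.rows       # still present, just marked deleted
store.compact(now=time.time() + GRACE_SECONDS + 1)
assert "col1" not in store.rows   # reclaimed only after grace period + compaction
```

Under this model, the observed 1.2 GB growing to 1.8 GB is consistent: every remove appends a marker rather than freeing data, and nothing shrinks until compaction runs after the grace period.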