incubator-cassandra-user mailing list archives

From Ramzi Rabah <>
Subject Re: Removes increasing disk space usage in Cassandra?
Date Fri, 04 Dec 2009 18:51:40 GMT
I think there might be a bug in the deletion logic. I removed all the
data on the cluster by running remove on every single key I had entered,
and then ran a major compaction on one node:
nodeprobe -host hostname compact
After the compaction finished, I was left with one data file, one index
file, and the bloom filter file, and they are the same size as before I
started doing the deletes.

On Thu, Dec 3, 2009 at 6:09 PM, Jonathan Ellis <> wrote:
> Cassandra never modifies data in place, so it writes tombstones to
> suppress the older writes, and when compaction occurs the data and
> tombstones get GC'd (after the period specified in your config file).
> On Thu, Dec 3, 2009 at 8:07 PM, Ramzi Rabah <> wrote:
>> Looking at jconsole I see a high number of writes when I do removes,
>> so I am guessing these are tombstones being written? If that's the
>> case, is the data being removed and replaced by tombstones? And will
>> they all be deleted eventually when compaction runs?
>> On Thu, Dec 3, 2009 at 3:18 PM, Ramzi Rabah <> wrote:
>>> Hi all,
>>> I ran a test where I inserted about 1.2 gigabytes of data into
>>> each node of a 4-node cluster.
>>> I then ran a script that calls a get on each inserted column, followed
>>> by a remove. Since I was basically removing every entry
>>> I had inserted before, I expected that the disk space occupied by the
>>> nodes would go down and eventually become 0. Instead, the disk space
>>> actually goes up to about 1.8 gigs per node when I do the bulk
>>> removes. Am I missing something here?
>>> Thanks a lot for your help
>>> Ray
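
The behavior described in this thread (removes are written as tombstones,
and tombstones are only garbage-collected by compaction after a configured
period) can be illustrated with a small toy model. This is only a sketch
under stated assumptions, not Cassandra's actual implementation: ToyStore,
Record, gc_grace and demo are hypothetical names, time is an abstract
integer, and "disk usage" is counted in records rather than bytes.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Record:
    key: str
    value: Optional[str]  # None marks a tombstone (hypothetical encoding)
    timestamp: int


class ToyStore:
    """Append-only toy store: nothing is ever modified in place."""

    def __init__(self, gc_grace: int) -> None:
        self.gc_grace = gc_grace  # assumed stand-in for the configured GC period
        self.segments: List[List[Record]] = []

    def write(self, key: str, value: str, now: int) -> None:
        self.segments.append([Record(key, value, now)])

    def remove(self, key: str, now: int) -> None:
        # A remove is just another write: a tombstone that shadows older data.
        self.segments.append([Record(key, None, now)])

    def records_on_disk(self) -> int:
        return sum(len(segment) for segment in self.segments)

    def compact(self, now: int) -> None:
        # Keep only the newest record per key across all segments.
        newest: Dict[str, Record] = {}
        for segment in self.segments:
            for rec in segment:
                current = newest.get(rec.key)
                if current is None or rec.timestamp >= current.timestamp:
                    newest[rec.key] = rec
        # Drop a tombstone only once it is at least gc_grace old.
        survivors = [
            rec
            for rec in newest.values()
            if not (rec.value is None and now - rec.timestamp >= self.gc_grace)
        ]
        self.segments = [survivors] if survivors else []


def demo() -> List[int]:
    store = ToyStore(gc_grace=10)
    sizes = []
    for i in range(1000):
        store.write(f"k{i}", "v", now=0)
    sizes.append(store.records_on_disk())  # after inserts
    for i in range(1000):
        store.remove(f"k{i}", now=1)
    sizes.append(store.records_on_disk())  # after removes: usage grows
    store.compact(now=2)
    sizes.append(store.records_on_disk())  # compaction inside the grace period
    store.compact(now=11)
    sizes.append(store.records_on_disk())  # compaction after the grace period
    return sizes


if __name__ == "__main__":
    print(demo())
```

In this toy model the record count rises while the removes are running,
and a compaction that runs before the grace period has elapsed still keeps
one tombstone per removed key. Only a compaction after the grace period
gets the count to zero. Whether this fully accounts for the file sizes
reported above is not something the sketch can establish.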
