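I don't think setting gc_grace_seconds to an hour is going to do what you'd expect. If you haven't run a repair within that hour, then once gc_grace_seconds has passed, the data you deleted may seem to have been undeleted, because a replica that missed the delete can bring it back after the tombstones are gone.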

Someone correct me if I'm wrong, but in order to completely delete data and regain the space it takes up, you need to "delete" it, which creates tombstones, and then run a repair on that column family within gc_grace_seconds. After that the data is actually gone and the space reclaimed.
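
To make the "undeleted" case concrete, here is a toy Python model of it. This is only a sketch of the idea, not Cassandra's code or API: the two dicts standing in for replicas and the helper names (purge_expired_tombstones, repair, read, scenario) are made up for illustration.

```python
# Toy model only, not Cassandra's implementation.
# Each replica is a dict of key -> (value, timestamp); value None is a tombstone.
GC_GRACE_SECONDS = 3600


def purge_expired_tombstones(replica, now):
    """Drop tombstones that are at least GC_GRACE_SECONDS old."""
    expired = [
        key
        for key, (value, ts) in replica.items()
        if value is None and now - ts >= GC_GRACE_SECONDS
    ]
    for key in expired:
        del replica[key]


def repair(a, b):
    """Copy the newest version of every key (tombstones included) to both replicas."""
    for key in set(a) | set(b):
        candidates = [r[key] for r in (a, b) if key in r]
        a[key] = b[key] = max(candidates, key=lambda cell: cell[1])


def read(replicas, key):
    """Return the newest value any replica holds, or None if deleted or absent."""
    cells = [r[key] for r in replicas if key in r]
    if not cells:
        return None
    return max(cells, key=lambda cell: cell[1])[0]


def scenario(repair_in_time):
    a = {"k": ("data", 0)}
    b = {"k": ("data", 0)}
    a["k"] = (None, 100)  # the delete reaches replica a only; b missed it
    if repair_in_time:
        repair(a, b)  # b receives the tombstone before it expires
    now = 100 + GC_GRACE_SECONDS
    purge_expired_tombstones(a, now)
    purge_expired_tombstones(b, now)
    return read([a, b], "k")


without_repair = scenario(repair_in_time=False)
with_repair = scenario(repair_in_time=True)
```

Without the repair, replica b never learns about the delete, so once a's tombstone is purged the old value is the only version left and reads return it again. With the repair done inside the grace period, both replicas hold the tombstone and the key stays deleted.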


On Tue, Jul 16, 2013 at 6:20 AM, 杨辉强 <huiqiangyang@yunrang.com> wrote:
Thank you!
It should be "update column family ScheduleInfoCF with gc_grace = 3600;"
Faint.

----- Original Message -----
From: "杨辉强" <huiqiangyang@yunrang.com>
To: user@cassandra.apache.org
Sent: Tuesday, July 16, 2013 6:15:12 PM
Subject: Re: Deletion use more space.

Hi,
  I used the following command to update gc_grace_seconds, but it reports an error. Why?

[default@WebSearch] update column family ScheduleInfoCF with gc_grace_seconds = 3600;
java.lang.IllegalArgumentException: No enum const class org.apache.cassandra.cli.CliClient$ColumnFamilyArgument.GC_GRACE_SECONDS


----- Original Message -----
From: "Michał Michalski" <michalm@opera.com>
To: user@cassandra.apache.org
Sent: Tuesday, July 16, 2013 5:51:49 PM
Subject: Re: Deletion use more space.

Deletion does not really "remove" data; it adds tombstones (deletion
markers). They are later merged with the existing data during
compaction and, in the end (see: gc_grace_seconds), removed, but
until then they take up some space.
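
A rough way to picture this is the toy Python model below. It is a sketch under simplifying assumptions, not Cassandra's storage engine: the Cell class and the compact and cell_count helpers are hypothetical names, and real compaction is more selective about which files it merges.

```python
# Toy model only, not Cassandra's storage engine or API.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cell:
    key: str
    value: Optional[bytes]  # None marks a tombstone
    timestamp: int          # write time, in seconds


def compact(sstables, now, gc_grace_seconds):
    """Merge all tables into one, keeping the newest cell per key and
    dropping tombstones that are at least gc_grace_seconds old."""
    newest = {}
    for table in sstables:
        for cell in table:
            current = newest.get(cell.key)
            if current is None or cell.timestamp > current.timestamp:
                newest[cell.key] = cell
    merged = []
    for cell in newest.values():
        expired = cell.value is None and now - cell.timestamp >= gc_grace_seconds
        if not expired:
            merged.append(cell)
    return [merged]


def cell_count(sstables):
    return sum(len(table) for table in sstables)


# Two rows written at t=0.
sstables = [[Cell("a", b"x" * 100, 0), Cell("b", b"y" * 100, 0)]]

# Deleting them at t=10 appends tombstones; the old data is still on disk.
sstables.append([Cell("a", None, 10), Cell("b", None, 10)])
after_delete = cell_count(sstables)

# Compaction before gc_grace_seconds: data is dropped, tombstones are kept.
early = compact(sstables, now=20, gc_grace_seconds=3600)

# Compaction after gc_grace_seconds: the tombstones go as well.
late = compact(early, now=10 + 3600, gc_grace_seconds=3600)
```

Right after the delete there are four cells stored instead of two, which is the extra disk usage. The first compaction leaves only the two tombstones, and the one run after gc_grace_seconds leaves nothing.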

http://wiki.apache.org/cassandra/DistributedDeletes

M.

On 16.07.2013 11:46, 杨辉强 wrote:
> Hi, all:
>    I use Cassandra 1.2.4 with a 4-node ring and the byte-ordered partitioner.
>    I inserted about 200 GB of data into the ring over the previous few days.
>
>    Today I wrote a program that scans the ring and deletes each item as it is scanned.
>    To my surprise, Cassandra's disk usage went up.
>
>     Can anybody tell me why? Thanks.
>