incubator-cassandra-user mailing list archives

From 杨辉强 <huiqiangy...@yunrang.com>
Subject Re: Deletion use more space.
Date Wed, 17 Jul 2013 01:52:16 GMT
Thanks, but Michael's answer confuses me even more.

I ran "list cf;" in cassandra-cli. It seems that lots of rows have been deleted, but their keys still exist.

Why do the keys still exist after the deletion? They seem useless.

RowKey: 3030303031306365633862356437636365303861303433343137656531306435
-------------------
RowKey: 3030303031316333616336366531613636373735396363323037396331613230
-------------------
RowKey: 3030303031316333616336366531613637303964616364363630663865313433
-------------------
RowKey: 3030303031323934613637303239323563633133303238626330646666626335
-------------------
RowKey: 3030303031323934613637303239323566303733303638373138366334323436
-------------------
RowKey: 3030303031333838333139303930633664643364613331316664363134656639
-------------------
RowKey: 3030303031336265343639303630613938376333366230363439316336333230
-------------------
RowKey: 3030303031336365653735376465616334633932333363363832653130363733
-------------------
RowKey: 3030303031343632343261363966376464656235373266663761633233353065
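
The row keys above are printed by cassandra-cli as hex-encoded bytes. As a small sketch, assuming the underlying bytes are plain ASCII (which holds for the keys shown here), they can be turned back into readable text in Python. The helper name decode_row_key is hypothetical and used only for illustration.

```python
def decode_row_key(hex_key: str) -> str:
    """Decode a hex-encoded row key, as printed by cassandra-cli, into text.

    Assumes the key bytes are ASCII; other encodings would need a
    different codec.
    """
    return bytes.fromhex(hex_key).decode("ascii")


# First key from the listing above.
first_key = "3030303031306365633862356437636365303861303433343137656531306435"
print(decode_row_key(first_key))
```

Decoding does not change what the listing means: each entry is a row key that the scan still returns even though no columns are shown for it.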


----- Original Message -----
From: "Michael Theroux" <mtheroux2@yahoo.com>
To: user@cassandra.apache.org
Sent: Tuesday, July 16, 2013 10:23:32 PM
Subject: Re: Deletion use more space.

The only time information is removed from the filesystem is during compaction.  Compaction
can remove tombstones after gc_grace_seconds, which could result in reanimation of deleted
data if the tombstone was never properly replicated to other replicas.  Repair will make sure
tombstones are consistent amongst replicas.  However, tombstones cannot be removed if the
data the tombstone is deleting is in another SSTable and has not yet been removed.

Hope this helps,
-Mike
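
As a rough illustration of the rules described above, the following Python sketch models SSTables as dictionaries and applies a simplified compaction step. It is a toy model under stated assumptions, not Cassandra's actual compaction code: the names Cell and compact are hypothetical, each SSTable holds one cell per row key, and any occurrence of a key in an SSTable outside the compaction is treated as blocking removal of that key's tombstone.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Cell:
    timestamp: int          # write time, in seconds
    value: Optional[str]    # None marks a tombstone (a deletion marker)

    @property
    def is_tombstone(self) -> bool:
        return self.value is None


# Toy SSTable: row key -> the cell stored for that key in this SSTable.
SSTable = Dict[str, Cell]


def compact(inputs: List[SSTable], others: List[SSTable],
            now: int, gc_grace_seconds: int) -> SSTable:
    """Merge `inputs` into one SSTable, keeping the newest cell per key.

    A tombstone is dropped only if it is older than gc_grace_seconds AND
    no SSTable outside this compaction (`others`) still holds the key.
    """
    merged: SSTable = {}
    for table in inputs:
        for key, cell in table.items():
            if key not in merged or cell.timestamp > merged[key].timestamp:
                merged[key] = cell

    result: SSTable = {}
    for key, cell in merged.items():
        if cell.is_tombstone:
            expired = now - cell.timestamp >= gc_grace_seconds
            held_elsewhere = any(key in table for table in others)
            if expired and not held_elsewhere:
                continue  # data and tombstone are both purged
        result[key] = cell
    return result


old = {"row1": Cell(100, "payload"), "row2": Cell(100, "payload")}
new = {"row1": Cell(200, None)}  # deleting row1 writes a tombstone

# Before gc_grace_seconds has passed, the tombstone survives compaction.
early = compact([old, new], [], now=300, gc_grace_seconds=3600)

# After gc_grace_seconds, with data and tombstone compacted together,
# row1 disappears entirely.
late = compact([old, new], [], now=3800, gc_grace_seconds=3600)

# After gc_grace_seconds, but with the data in an SSTable that is not
# part of this compaction, the tombstone has to stay.
blocked = compact([new], [old], now=3800, gc_grace_seconds=3600)
```

In this model a delete only adds a new entry (the tombstone in `new`), so the total stored data grows until a compaction that satisfies both conditions runs.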

  
On Jul 16, 2013, at 10:04 AM, Andrew Bialecki wrote:

> I don't think setting gc_grace_seconds to an hour is going to do what you'd expect. After
> gc_grace_seconds, if you haven't run a repair within that hour, the data you deleted will
> seem to have been undeleted.
> 
> Someone correct me if I'm wrong, but in order to completely delete data and
> regain the space it takes up, you need to "delete" it, which creates tombstones, and then
> run a repair on that column family within gc_grace_seconds. After that the data is actually
> gone and the space reclaimed.
> 
> 
> On Tue, Jul 16, 2013 at 6:20 AM, 杨辉强 <huiqiangyang@yunrang.com> wrote:
> Thank you!
> It should be "update column family ScheduleInfoCF with gc_grace = 3600;"
> Faint.
> 
> ----- Original Message -----
> From: "杨辉强" <huiqiangyang@yunrang.com>
> To: user@cassandra.apache.org
> Sent: Tuesday, July 16, 2013 6:15:12 PM
> Subject: Re: Deletion use more space.
> 
> Hi,
>   I used the following command to update gc_grace_seconds. It reports an error! Why?
> 
> [default@WebSearch] update column family ScheduleInfoCF with gc_grace_seconds = 3600;
> java.lang.IllegalArgumentException: No enum const class org.apache.cassandra.cli.CliClient$ColumnFamilyArgument.GC_GRACE_SECONDS
> 
> 
> ----- Original Message -----
> From: "Michał Michalski" <michalm@opera.com>
> To: user@cassandra.apache.org
> Sent: Tuesday, July 16, 2013 5:51:49 PM
> Subject: Re: Deletion use more space.
> 
> Deletion is not really "removing" data, but it's adding tombstones
> (markers) of deletion. They'll be later merged with existing data during
> compaction and - in the end (see: gc_grace_seconds) - removed, but
> until then they'll take up some space.
> 
> http://wiki.apache.org/cassandra/DistributedDeletes
> 
> M.
> 
> On 16.07.2013 11:46, 杨辉强 wrote:
> > Hi, all:
> >    I use Cassandra 1.2.4 with a 4-node ring and the byte-ordered partitioner.
> >    I inserted about 200G of data into the ring over the previous days.
> >
> >    Today I wrote a program that scans the ring and deletes the items as they
> >    are scanned.
> >    To my surprise, Cassandra's disk usage went up.
> >
> >    Can anybody tell me why? Thanks.
> >
> 
