incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Deletion use more space.
Date Wed, 17 Jul 2013 10:02:45 GMT
You are seeing range ghosts: http://wiki.apache.org/cassandra/FAQ#range_ghosts

Lots of client APIs and CQL 3 hide this from you now.
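
For clients that do not hide range ghosts, filtering them out by hand is simple. The following is only a minimal sketch: it assumes a range scan returns a mapping of row key to columns, with deleted rows coming back as keys that have no columns. The function name and data shape are hypothetical and do not belong to any particular client library.

```python
# Hypothetical shape of a range-scan result: row key -> {column name: value}.
# A deleted row can still come back with its key but no columns (a "range ghost").
def drop_range_ghosts(rows):
    """Return only the rows that still have at least one live column."""
    return {key: cols for key, cols in rows.items() if cols}


scan_result = {
    "row-a": {"name": "alice"},
    "row-b": {},  # deleted row: the key is returned, the columns are gone
    "row-c": {"name": "carol"},
}
live_rows = drop_range_ghosts(scan_result)
print(sorted(live_rows))
```

The ghost keys stop appearing on their own once compaction has purged the corresponding tombstones.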

Cheers
 
-----------------
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 17/07/2013, at 1:52 PM, 杨辉强 <huiqiangyang@yunrang.com> wrote:

> Thanks, but Michael's answer confuses me more.
> 
> I ran "list cf;" in cassandra-cli. It seems lots of rows have been deleted, but their keys still exist.
> 
> After the deletion, why do the keys still exist? They seem useless.
> 
> RowKey: 3030303031306365633862356437636365303861303433343137656531306435
> -------------------
> RowKey: 3030303031316333616336366531613636373735396363323037396331613230
> -------------------
> RowKey: 3030303031316333616336366531613637303964616364363630663865313433
> -------------------
> RowKey: 3030303031323934613637303239323563633133303238626330646666626335
> -------------------
> RowKey: 3030303031323934613637303239323566303733303638373138366334323436
> -------------------
> RowKey: 3030303031333838333139303930633664643364613331316664363134656639
> -------------------
> RowKey: 3030303031336265343639303630613938376333366230363439316336333230
> -------------------
> RowKey: 3030303031336365653735376465616334633932333363363832653130363733
> -------------------
> RowKey: 3030303031343632343261363966376464656235373266663761633233353065
> 
> 
> ----- Original Message -----
> From: "Michael Theroux" <mtheroux2@yahoo.com>
> To: user@cassandra.apache.org
> Sent: Tuesday, July 16, 2013, 10:23:32 PM
> Subject: Re: Deletion use more space.
> 
> The only time information is removed from the filesystem is during compaction.
> Compaction can remove tombstones after gc_grace_seconds, which could result in
> reanimation of deleted data if the tombstone was never properly replicated to
> other replicas. Repair will make sure tombstones are consistent amongst replicas.
> However, tombstones cannot be removed if the data the tombstone is deleting is
> in another SSTable and has not yet been removed.
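
The two conditions Mike describes can be written down as a toy rule. This is a sketch under simplifying assumptions, not Cassandra's actual compaction code: the `Tombstone` class, the `can_purge` function, and the idea of passing in a set of keys held by SSTables outside the current compaction are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Tombstone:
    key: str
    deleted_at: int  # seconds since epoch when the delete was issued


def can_purge(tombstone, now, gc_grace_seconds, keys_in_other_sstables):
    """Toy rule: a tombstone may be dropped only if the grace period has
    passed AND no SSTable outside this compaction may still hold data
    for the same key."""
    expired = now - tombstone.deleted_at > gc_grace_seconds
    shadows_other_data = tombstone.key in keys_in_other_sstables
    return expired and not shadows_other_data


t = Tombstone(key="row-1", deleted_at=1000)
print(can_purge(t, now=1000 + 3601, gc_grace_seconds=3600,
                keys_in_other_sstables=set()))
```

In this model, a short gc_grace_seconds only satisfies the first condition. The tombstone still stays on disk while another SSTable holds the data it covers.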
> 
> Hope this helps,
> -Mike
> 
> 
> On Jul 16, 2013, at 10:04 AM, Andrew Bialecki wrote:
> 
>> I don't think setting gc_grace_seconds to an hour is going to do what you'd expect.
>> After gc_grace_seconds, if you haven't run a repair within that hour, the data you
>> deleted will seem to have been undeleted.
>> 
>> Someone correct me if I'm wrong, but in order to completely delete data and
>> regain the space it takes up, you need to "delete" it, which creates tombstones,
>> and then run a repair on that column family within gc_grace_seconds. After that
>> the data is actually gone and the space reclaimed.
>> 
>> 
>> On Tue, Jul 16, 2013 at 6:20 AM, 杨辉强 <huiqiangyang@yunrang.com> wrote:
>> Thank you!
>> It should be "update column family ScheduleInfoCF with gc_grace = 3600;"
>> Faint.
>> 
>> ----- Original Message -----
>> From: "杨辉强" <huiqiangyang@yunrang.com>
>> To: user@cassandra.apache.org
>> Sent: Tuesday, July 16, 2013, 6:15:12 PM
>> Subject: Re: Deletion use more space.
>> 
>> Hi,
>>  I used the following command to update gc_grace_seconds. It reports an error. Why?
>> 
>> [default@WebSearch] update column family ScheduleInfoCF with gc_grace_seconds = 3600;
>> java.lang.IllegalArgumentException: No enum const class org.apache.cassandra.cli.CliClient$ColumnFamilyArgument.GC_GRACE_SECONDS
>> 
>> 
>> ----- Original Message -----
>> From: "Michał Michalski" <michalm@opera.com>
>> To: user@cassandra.apache.org
>> Sent: Tuesday, July 16, 2013, 5:51:49 PM
>> Subject: Re: Deletion use more space.
>> 
>> Deletion is not really "removing" data; it adds tombstones (markers)
>> of deletion. They are later merged with the existing data during
>> compaction and, in the end (see: gc_grace_seconds), removed, but until
>> then they take up some space.
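
Why a mass delete makes disk usage grow can be seen in a toy append-only store. This is only an illustrative sketch: the `ToyStore` class is hypothetical, it counts records rather than bytes, and its `compact` method ignores gc_grace_seconds and replication entirely.

```python
class ToyStore:
    """Append-only log: writes and deletes both add records."""

    def __init__(self):
        # Each record is (key, value); a value of None marks a tombstone.
        self.records = []

    def put(self, key, value):
        self.records.append((key, value))

    def delete(self, key):
        # A delete does not remove anything; it appends a tombstone.
        self.records.append((key, None))

    def size(self):
        return len(self.records)

    def compact(self):
        """Keep the latest record per key, then drop keys whose latest
        record is a tombstone."""
        latest = {}
        for key, value in self.records:
            latest[key] = value
        self.records = [(k, v) for k, v in latest.items() if v is not None]


store = ToyStore()
for i in range(3):
    store.put(f"k{i}", "x")
size_after_writes = store.size()

for i in range(3):
    store.delete(f"k{i}")
size_after_deletes = store.size()  # larger than before the deletes

store.compact()
size_after_compaction = store.size()
print(size_after_writes, size_after_deletes, size_after_compaction)
```

Space is only given back in the compaction step, which matches the behaviour reported at the start of this thread.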
>> 
>> http://wiki.apache.org/cassandra/DistributedDeletes
>> 
>> M.
>> 
>> On 16.07.2013 at 11:46, 杨辉强 wrote:
>>> Hi, all:
>>>   I use Cassandra 1.2.4 with a 4-node ring and the byte-ordered partitioner.
>>>   I inserted about 200 GB of data into the ring over the previous days.
>>> 
>>>   Today I wrote a program that scans the ring and deletes the items as they are scanned.
>>>   To my surprise, Cassandra's disk usage went up.
>>> 
>>>    Can anybody tell me why? Thanks.
>>> 
>> 

