incubator-cassandra-user mailing list archives

From Sylvain Lebresne <sylv...@yakaz.com>
Subject Re: column expiration and rows in 0.7
Date Thu, 23 Sep 2010 08:07:50 GMT
A few things:
  1- In your case, the expiring columns become gc-able 5 hours after the
      column expires, so roughly 9 hours after the column insertion, not 5.
  2- Columns are not necessarily removed exactly when gc_grace_seconds
      elapses. They get removed by the first *major* compaction that is
      triggered after that time elapses. Major compactions are not run
      automatically; you'll have to trigger one through JMX (with nodetool,
      for instance).
  3- For info, as Stu said, in recent trunk some tombstones are deleted by
      minor compactions. But that requires some conditions that may not be
      met in your case, so you need a major compaction to be sure.
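
To make the timing in points 1 and 2 concrete, here is a minimal sketch in Python. It is a simplified model, not Cassandra code: the function names (gc_eligible_at, removed_at) and the example timestamps are hypothetical, and it assumes a column is purged by the first major compaction that runs at or after insertion time + ttl + gc_grace_seconds.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional


def gc_eligible_at(inserted_at: datetime, ttl: timedelta, gc_grace: timedelta) -> datetime:
    """Earliest moment an expiring column can be purged (simplified model).

    The column expires at inserted_at + ttl, and only becomes gc-able
    once gc_grace has elapsed after that expiry.
    """
    return inserted_at + ttl + gc_grace


def removed_at(eligible_at: datetime, major_compactions: Iterable[datetime]) -> Optional[datetime]:
    """Time of the first major compaction at or after eligibility.

    Returns None if no such compaction is ever triggered, in which case
    the data stays on disk in this model.
    """
    later = [t for t in major_compactions if t >= eligible_at]
    return min(later) if later else None


# Hypothetical timeline matching the thread: ttl = 4 hours, gc_grace = 5 hours.
inserted = datetime(2010, 9, 23, 0, 0)
eligible = gc_eligible_at(inserted, timedelta(hours=4), timedelta(hours=5))

# A compaction at 06:00 is too early; the one at 12:00 purges the column.
compactions = [datetime(2010, 9, 23, 6, 0), datetime(2010, 9, 23, 12, 0)]
purged = removed_at(eligible, compactions)
```

With these inputs, eligible is 09:00 (9 hours after insertion) and purged is 12:00. With an empty compaction list, removed_at returns None, which corresponds to the ever-growing data size described in the question.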

--
Sylvain

On Thu, Sep 23, 2010 at 9:56 AM, Alaa Zubaidi <alaa.zubaidi@pdf.com> wrote:
>  Thanks.
> But in 0.7 every CF has its own "GCgraceSeconds", which is gc_grace_seconds.
> I am setting gc_grace_seconds to 5 hours and the columns' "ttl" to 4 hours.
> This means that after 5 hours the columns should be removed, and the keys
> removed too, right? However, I still see the keys, and the size of the data
> is always growing.
>
> Alaa
>
> On 9/22/2010 8:00 PM, Stu Hood wrote:
>>
>> Minor compactions will often be able to perform this garbage collection as
>> well in 0.6.6 and 0.7.0 due to a great optimization implemented by Sylvain:
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-1074
>>
>> -----Original Message-----
>> From: "Aaron Morton"<aaron@thelastpickle.com>
>> Sent: Wednesday, September 22, 2010 7:47pm
>> To: "user@cassandra.apache.org"<user@cassandra.apache.org>
>> Subject: Re: column expiration and rows in 0.7
>>
>> The data will only be physically deleted when a major compaction runs after
>> GCGraceSeconds has passed. You need to trigger the compaction using
>> nodetool.
>>
>> http://wiki.apache.org/cassandra/DistributedDeletes
>>
>> Aaron
>> On 23 Sep 2010, at 12:14, Alaa Zubaidi<alaa.zubaidi@pdf.com>  wrote:
>>
>>> Hi,
>>> I am expecting my data size to be around nGB. However, it keeps growing
>>> and growing.
>>>
>>> I am setting the gc_grace_seconds for the CF to 5 hours, and I am also
>>> setting "ttl" for all columns on a row, expecting that these columns will
>>> be "deleted" after the ttl time and "removed" after gc_grace_seconds.
>>> I was also told that if ALL columns are deleted, the whole row will be
>>> deleted as well. Is this true or not?
>>>
>>> Thanks,
>>>
>>> Alaa Zubaidi
>>>
>>> PDF Solutions, Inc.
>>> 333 West San Carlos Street, Suite 700
>>> San Jose, CA 95110  USA
>>> Tel: 408-283-5639 (or 408-280-7900 x5639)
>>> fax: 408-938-6479
>>> email: alaa.zubaidi@pdf.com
>>>
>>>
>>
>>
>>
>
