cassandra-commits mailing list archives

From "Sylvain Lebresne (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-3354) tombstone not removed after compaction
Date Thu, 13 Oct 2011 09:23:12 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126455#comment-13126455
] 

Sylvain Lebresne commented on CASSANDRA-3354:
---------------------------------------------

No, expiring columns don't need two compactions to get gc'ed. The conversion of an expiring column
to a tombstone is only a space optimization (to potentially reclaim the space of the column value
during the usually fairly long gc_grace period), but it does not change when the column
is gc'ed.

So, an expired expiring column is gc'ed as soon as it can be, in one shot (I just tried it to
be sure and it works).

Now, as for what you are seeing, I'm not sure what it is, but here are some things to check:
  * for an expired column to be gc'ed during a compaction, it needs to be gcable at the *start*
of the compaction (same for tombstones, actually). That could make a difference for long-running
compactions (and yes, we could probably improve that, but I doubt it has a big impact in practice).
  * related to the previous point, expiring columns are converted to tombstones at read time. This
is true in particular for the reads done by sstable2json. This means that when sstable2json
shows you a tombstone, it could actually be an expired column inside the sstable, and it may
turn out that this column had not yet expired at the time of the compaction.
  * only major compactions are guaranteed to gc all tombstones. Though if you've used 'nodetool
compact', then you've triggered a major one.
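
To make the first two points concrete, here is a minimal Python sketch of the check. It is not Cassandra's code: the Cell class and both functions are hypothetical, and it assumes the rule is "purgeable once the local deletion time (the expiry time, for an expiring column) is older than compaction start minus gc_grace". It also assumes that the hex value "4e95a659" in the sstable2json record from the report is that local deletion time in seconds; the compaction start times used below are made up for illustration.

```python
from dataclasses import dataclass

GC_GRACE_SECONDS = 2 * 60 * 60  # 2 hours, as set in the report


@dataclass
class Cell:
    """Hypothetical stand-in for a column stored in an sstable."""
    local_deletion_time: int  # epoch seconds; expiry time for an expiring column
    is_tombstone: bool


def is_gcable(cell: Cell, compaction_start: int,
              gc_grace: int = GC_GRACE_SECONDS) -> bool:
    """Assumed rule: the cutoff is fixed when the compaction starts."""
    gc_before = compaction_start - gc_grace
    return cell.local_deletion_time < gc_before


def as_read(cell: Cell, now: int) -> Cell:
    """Read-time view: an expiring column past its expiry shows up as a tombstone."""
    if not cell.is_tombstone and cell.local_deletion_time <= now:
        return Cell(cell.local_deletion_time, True)
    return cell


# Assumption: the hex field of the reported record is the local deletion time.
expiry = int("4e95a659", 16)
stored = Cell(local_deletion_time=expiry, is_tombstone=False)

# Hypothetical compaction start times, one minute either side of the cutoff.
too_early = expiry + GC_GRACE_SECONDS - 60
late_enough = expiry + GC_GRACE_SECONDS + 60

print(is_gcable(stored, too_early))             # not yet gcable when the compaction starts
print(is_gcable(stored, late_enough))           # gcable in a single compaction
print(as_read(stored, late_enough).is_tombstone)  # a later read shows a tombstone
```

Under these assumptions, a compaction that starts shortly before the cutoff keeps the column even if it finishes after it, and a later sstable2json run would still display that column as a tombstone.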
                
> tombstone not removed after compaction
> --------------------------------------
>
>                 Key: CASSANDRA-3354
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3354
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Yang Yang
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>
> I set GC_grace to 2 hours, for testing.
> then I compacted the sstables using nodecmd,
> but the resulting sstables contained many Deletion records older than 2 hours
> "0000000000000d5e3263303666346331000000000000000100000000":
> [["00000132f8820139303030303030303030303030303030303030303030303030303030303030303030303263303666346332","4e95a659",1318429297125,"d"]],
> yyang@ip-10-71-86-162:~/src/svn/whisky$ perl -e 'print gmtime(1318429297)."\n" '
> Wed Oct 12 14:21:37 2011
> -rw-r--r-- 1 yyang yyang 381366163 2011-10-12 16:39
> /mnt/cass/lib/cassandra/data/testBudget_items/multi_click_filter-h-511-Data.db
> but it seems that after running a few more compactions, these records are gone


        
