cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] Updated: (CASSANDRA-604) Compactions might remove tombstones without removing the actual data
Date Tue, 08 Dec 2009 07:04:20 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-604:
-------------------------------------

    Component/s: Core
       Priority: Major  (was: Minor)

> Compactions might remove tombstones without removing the actual data
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-604
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-604
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Cent-OS
>            Reporter: Ramzi Rabah
>            Assignee: Jonathan Ellis
>             Fix For: 0.5
>
>         Attachments: 604.patch
>
>
> I was looking at the code for compaction, and noticed that when we are doing compactions
> during normal Cassandra operation, we call:
>            for (List<SSTableReader> sstables : getCompactionBuckets(ssTables_, 50L * 1024L * 1024L))
>            {
>                if (sstables.size() < minThreshold)
>                {
>                    // too few files in this bucket: skip it
>                    continue;
>                }
>                // otherwise, compact the sstables in this bucket...
>            }
> where getCompactionBuckets groups into buckets either very small files or
> files whose sizes are within 0.5x to 1.5x of each other. A bucket is only
> compacted if it holds at least minThreshold files, which is 4 by default.
> So far so good. Now consider this scenario: I have an old entry that
> I inserted a long time ago and that was compacted into a 75MB file.
> There are fewer than 4 files of about 75MB. I do many deletes and end up
> with 4 extra sstable files filled with tombstones, each about 300MB.
> These 4 files are compacted together, and in the compaction code, if
> a tombstone is there we don't copy it over to the new file. Since we
> did not compact the 75MB files but did compact the tombstone files,
> the tombstone is gone while the data is still intact in the 75MB file.
> If we compacted all the files together I don't think that would be a
> problem, but since we only compact 4, this potentially leaves deleted
> data not cleaned up.
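
The scenario above can be reproduced with a small standalone sketch. This is
not the Cassandra code: bucketBySize and eligibleBuckets are hypothetical
stand-ins for getCompactionBuckets and the minThreshold check, sizes are plain
numbers in MB rather than SSTableReader objects, and the rule "within 0.5x to
1.5x of the bucket's average size" is an assumption based on the description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CompactionBucketSketch {

    // Hypothetical stand-in for getCompactionBuckets: groups file sizes (in MB).
    // A file joins a bucket if its size is within 0.5x-1.5x of the bucket's
    // average, or if both the file and the bucket average are below smallLimit.
    static List<List<Long>> bucketBySize(List<Long> sizes, long smallLimit) {
        List<List<Long>> buckets = new ArrayList<>();
        for (long size : sizes) {
            List<Long> target = null;
            for (List<Long> bucket : buckets) {
                double avg = average(bucket);
                boolean similar = size > avg * 0.5 && size < avg * 1.5;
                boolean bothSmall = size < smallLimit && avg < smallLimit;
                if (similar || bothSmall) {
                    target = bucket;
                    break;
                }
            }
            if (target == null) {
                // No compatible bucket found: start a new one.
                target = new ArrayList<>();
                buckets.add(target);
            }
            target.add(size);
        }
        return buckets;
    }

    static double average(List<Long> bucket) {
        long sum = 0;
        for (long s : bucket) {
            sum += s;
        }
        return (double) sum / bucket.size();
    }

    // Mirrors the "sstables.size() < minThreshold -> continue" check:
    // only buckets with at least minThreshold files are compacted.
    static List<List<Long>> eligibleBuckets(List<List<Long>> buckets, int minThreshold) {
        List<List<Long>> eligible = new ArrayList<>();
        for (List<Long> bucket : buckets) {
            if (bucket.size() >= minThreshold) {
                eligible.add(bucket);
            }
        }
        return eligible;
    }

    public static void main(String[] args) {
        // One old 75MB data file plus four 300MB files full of tombstones.
        List<Long> sizes = Arrays.asList(75L, 300L, 300L, 300L, 300L);
        List<List<Long>> buckets = bucketBySize(sizes, 50L);
        System.out.println("buckets:  " + buckets);
        System.out.println("eligible: " + eligibleBuckets(buckets, 4));
    }
}
```

Under these assumptions the 75MB file ends up alone in its own bucket and is
skipped, while the four 300MB tombstone files form the only bucket that meets
the threshold. If compacting that bucket drops the tombstones, the old entry in
the 75MB file is no longer shadowed by anything.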

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

