cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-5182) Deletable rows are sometimes not removed during compaction
Date Thu, 24 Jan 2013 08:43:15 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561502#comment-13561502 ]

Sylvain Lebresne commented on CASSANDRA-5182:
---------------------------------------------

bq. Maybe it is better to check if fp_chance is high before going through index file

Actually, I agree with Yuki on that, and I'm kind of -1 on the patch in its current form. With
the current patch, whatever your fp_chance is, every check where the row is indeed present in
a non-compacted sstable might hit the disk (unless the key cache saves you). Such a row does
prevent GCing it in this compaction, but that case will not necessarily be rare. So I'd
be in favor of using getPosition only if fp_chance == 1, at least on 1.1, as we have no idea
of the impact this can have on people who haven't disabled the bloom filter and have no problem
whatsoever with GCing tombstones.
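
To make the suggestion above concrete, here is a minimal sketch of the proposed check. It is not the actual patch or Cassandra's code: `PurgeCheckSketch`, `SSTableSketch`, `InMemorySSTable`, `indexContains`, and `shouldPurge` are hypothetical names, and `indexContains` only stands in for the exact lookup that getPosition would do.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these types are Cassandra's real classes.
final class PurgeCheckSketch {

    /** Stand-in for an sstable that is not part of the current compaction. */
    interface SSTableSketch {
        double bloomFilterFpChance();
        boolean bloomFilterMightContain(String key); // cheap, may give false positives
        boolean indexContains(String key);           // exact, but may hit the disk
    }

    /** Use the exact index lookup only when the bloom filter is disabled. */
    static boolean mightContainKey(SSTableSketch sstable, String key) {
        if (sstable.bloomFilterFpChance() >= 1.0) {
            // fp_chance == 1: the filter answers "maybe" for every key,
            // so only the index can rule the key out.
            return sstable.indexContains(key);
        }
        // Otherwise trust the filter and never pay for an index read here.
        return sstable.bloomFilterMightContain(key);
    }

    /** A row's tombstones may be purged only if no other sstable might hold the key. */
    static boolean shouldPurge(List<? extends SSTableSketch> overlapping, String key) {
        for (SSTableSketch sstable : overlapping) {
            if (mightContainKey(sstable, key)) {
                return false;
            }
        }
        return true;
    }

    /** Toy in-memory sstable that counts index lookups, for experimentation. */
    static final class InMemorySSTable implements SSTableSketch {
        private final double fpChance;
        private final Set<String> keys;
        int indexLookups = 0;

        InMemorySSTable(double fpChance, Set<String> keys) {
            this.fpChance = fpChance;
            this.keys = keys;
        }

        @Override public double bloomFilterFpChance() { return fpChance; }

        @Override public boolean bloomFilterMightContain(String key) {
            // A disabled filter says "maybe" to everything; for simplicity the
            // enabled filter is modelled as exact (no false positives).
            return fpChance >= 1.0 || keys.contains(key);
        }

        @Override public boolean indexContains(String key) {
            indexLookups++;
            return keys.contains(key);
        }
    }
}
```

With this shape, users who kept the bloom filter enabled see no extra index reads during compaction, while users who set fp_chance to 1 get an exact answer and can purge rows that are absent from the other sstables.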

As a side note, I've opened CASSANDRA-5183, which is related to this tombstone purging problem.
                
> Deletable rows are sometimes not removed during compaction
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-5182
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5182
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1.5
>            Reporter: Binh Van Nguyen
>            Assignee: Yuki Morishita
>             Fix For: 1.1.10, 1.2.1
>
>         Attachments: 5182-1.1.txt, test_ttl.tar.gz
>
>
> Our use case is write heavy and read seldom. To optimize the space used, we've set
> bloom_filter_fp_ratio=1.0. That, along with the fact that each row is only written once
> and that there are more than 20 SSTables, keeps the rows from ever being compacted. Here is
> the code:
> https://github.com/apache/cassandra/blob/cassandra-1.1/src/java/org/apache/cassandra/db/compaction/CompactionController.java#L162
> We hit this corner case, and because of it C* keeps consuming more and more disk space
> when it should not.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
