cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-4897) Allow tiered compaction define max sstable size
Date Fri, 21 Dec 2012 09:31:13 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537772#comment-13537772 ]

Sylvain Lebresne commented on CASSANDRA-4897:
---------------------------------------------

bq. There needs to be some kind of heuristics when to compact bucket with max sized tables

In a way I agree, but that's pretty much what leveled compaction is. I'm of the opinion that
it might be better to spend time optimizing leveled compaction rather than spend it trying
to hack size-tiered compaction into doing things it wasn't designed for and that somewhat go
against its nature.
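
To make the "against its nature" point concrete, here is a rough sketch of size-tiered
bucketing with a hypothetical maxSSTableSize cap (illustration only, not the attached patch
and not the actual SizeTieredCompactionStrategy code; the 0.5/1.5 bucket bounds and the
threshold of 4 are just the usual defaults). Once sstables reach the cap they all fall into
the same bucket, which is exactly the bucket the strategy would normally want to merge, hence
the need for the extra heuristic quoted above:

{code:java}
import java.util.ArrayList;
import java.util.List;

// Illustration only: simplified size-tiered bucketing plus a hypothetical max size cap.
class SizeTieredSketch
{
    static final double BUCKET_LOW = 0.5, BUCKET_HIGH = 1.5;   // assumed defaults
    static final long MAX_SSTABLE_BYTES = 256L * 1024 * 1024;  // hypothetical cap

    // Group sstables (by on-disk size, in bytes) into buckets of similar size.
    static List<List<Long>> bucket(List<Long> sstableSizes)
    {
        List<List<Long>> buckets = new ArrayList<List<Long>>();
        for (long size : sstableSizes)
        {
            boolean placed = false;
            for (List<Long> b : buckets)
            {
                long sum = 0;
                for (long s : b)
                    sum += s;
                double avg = (double) sum / b.size();
                if (size > avg * BUCKET_LOW && size < avg * BUCKET_HIGH)
                {
                    b.add(size);
                    placed = true;
                    break;
                }
            }
            if (!placed)
            {
                List<Long> fresh = new ArrayList<Long>();
                fresh.add(size);
                buckets.add(fresh);
            }
        }
        return buckets;
    }

    // With the cap, a bucket made entirely of max-sized sstables is never eligible,
    // so those sstables simply accumulate unless some extra trigger kicks in.
    static boolean shouldCompact(List<Long> bucket)
    {
        boolean allAtCap = true;
        for (long s : bucket)
            if (s < MAX_SSTABLE_BYTES)
                allAtCap = false;
        return bucket.size() >= 4 && !allAtCap;
    }
}
{code}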

bq. If oldest table is older then user configured number of hours, then run compaction

It'll kind of work, but imho it's a hack that has a number of downsides in practice. The main
goal of compaction is to keep the number of sstables you need to look at for a read low at
all times. But by configuring a time threshold after which sstables get compacted, you don't
control very well how many sstables may accumulate during that window. That means this setting
will be hard for users to tune right, and downright impossible to tune correctly if your write
load varies too much over time. And on the other hand, if you set it too low, you will compact
sstables regularly even when they don't need to be.
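
To put some numbers on the tuning problem, here's a sketch of the age-based trigger as I
understand the proposal (hypothetical names, not the patch itself): the trigger is trivial to
write, but how many sstables a read has to touch while waiting for it depends entirely on the
flush rate, so the same "hours" value behaves very differently across workloads.

{code:java}
import java.util.List;
import java.util.concurrent.TimeUnit;

// Illustration only: the proposed "compact when the oldest sstable is older than N hours".
class AgeTriggerSketch
{
    static class SSTable
    {
        final long sizeBytes;
        final long createdAtMillis;
        SSTable(long sizeBytes, long createdAtMillis)
        {
            this.sizeBytes = sizeBytes;
            this.createdAtMillis = createdAtMillis;
        }
    }

    static boolean shouldCompact(List<SSTable> cappedBucket, long maxAgeHours, long nowMillis)
    {
        long oldest = nowMillis;
        for (SSTable t : cappedBucket)
            oldest = Math.min(oldest, t.createdAtMillis);
        return nowMillis - oldest > TimeUnit.HOURS.toMillis(maxAgeHours);
    }

    // Back-of-the-envelope accumulation before the trigger fires: one flush every
    // 10 minutes with a 12 hour threshold leaves ~72 sstables for reads to consult,
    // while one flush per hour with the same threshold leaves only ~12.
    static long sstablesAccumulated(long flushesPerHour, long maxAgeHours)
    {
        return flushesPerHour * maxAgeHours;
    }
}
{code}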

Don't get me wrong, I buy that for certain workloads and certain values of maxSSTableSize
and 'configured number of hours between compactions', you could get something reasonably
useful. But we also have to consider the risk of users shooting themselves in the foot, and
I'm not yet sold on that risk being acceptable in this case.
                
> Allow tiered compaction define max sstable size
> -----------------------------------------------
>
>                 Key: CASSANDRA-4897
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4897
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Radim Kolar
>            Assignee: Radim Kolar
>             Fix For: 1.2.1
>
>         Attachments: cass-maxsize1.txt, cass-maxsize2.txt
>
>
> Lucene does the same thing. A correctly configured max segment size will recycle old data
> faster with less disk space.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
