cassandra-commits mailing list archives

From "Kurt Greaves (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-13258) Rethink read-time defragmentation introduced in 1.1 (CASSANDRA-2503)
Date Mon, 27 Feb 2017 11:01:45 GMT


Kurt Greaves commented on CASSANDRA-13258:

I don't think it's reasonable to expect to hit fewer than the min threshold of SSTables when
using STCS. To me, STCS is a strategy that "kind of just works"; it's by no means perfect, but
that doesn't matter to us. However, I can definitely see how this kind of behaviour would
seriously make bad situations worse, and I think the optimisation use case would be pretty rare.

That being said, it would be interesting to see some benchmarks associated with removing it.
If there is a use case where it helps, then some kind of middle ground where only hot rows
are re-written would be cool and would probably reduce the impact on compactions.

> Rethink read-time defragmentation introduced in 1.1 (CASSANDRA-2503)
> --------------------------------------------------------------------
>                 Key: CASSANDRA-13258
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Nate McCall
> tl;dr: we issue a Mutation(!) on a read when using STCS and there are more than minCompactedThreshold
SSTables encountered by the iterator. (See org/apache/cassandra/db/
> I can see a couple of use cases where this *might* be useful, but from a practical stand
point, this is an excellent way to exacerbate compaction falling behind.
> With the introduction of other, purpose-built compaction strategies, I would be interested
to hear why anyone still considers this a good idea. Note that we only do it for STCS,
so at best we are inconsistent.
> There are some interesting comments on CASSANDRA-10342 regarding this as well.
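The behaviour under discussion can be sketched roughly as follows. This is a minimal, hypothetical illustration, not Cassandra's actual read path: the class, method names, and the fixed threshold value are assumptions for illustration only. The idea is that, under STCS, a read which had to merge data from more SSTables than the minimum compaction threshold triggers a re-write of the merged row as a Mutation:

```java
// Hypothetical sketch of read-time defragmentation (not Cassandra source).
// Under STCS, if a read touches more SSTables than the min compaction
// threshold, the merged row is written back so future reads hit one SSTable.
public class DefragSketch {

    // Assumed value for illustration; in Cassandra this is per-table config.
    static final int MIN_COMPACTION_THRESHOLD = 4;

    // Decide whether the read path should issue a defragmenting re-write.
    static boolean shouldDefragment(boolean usesSTCS, int sstablesHit) {
        return usesSTCS && sstablesHit > MIN_COMPACTION_THRESHOLD;
    }

    public static void main(String[] args) {
        // A read merging 5 SSTables under STCS would trigger a Mutation...
        System.out.println(shouldDefragment(true, 5));
        // ...but a read touching 3 would not,
        System.out.println(shouldDefragment(true, 3));
        // and other compaction strategies never trigger it at all.
        System.out.println(shouldDefragment(false, 10));
    }
}
```

This also makes the inconsistency complaint above concrete: the re-write fires only when the strategy check passes, so identical fragmentation under LCS or TWCS is simply left alone.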

This message was sent by Atlassian JIRA
