cassandra-commits mailing list archives

From "sankalp kohli (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-7203) Flush (and Compact) High Traffic Partitions Separately
Date Tue, 02 Dec 2014 17:37:12 GMT


sankalp kohli commented on CASSANDRA-7203:

"and I think we have bigger fish to fry."
I agree with Jason here :).

I have not thought about all the use cases we have, but this is not currently a problem.

> Flush (and Compact) High Traffic Partitions Separately
> ------------------------------------------------------
>                 Key: CASSANDRA-7203
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>              Labels: compaction, performance
> An idea possibly worth exploring is the use of streaming count-min sketches to collect
data over the up-time of a server to estimate the velocity of different partitions, so that
high-volume partitions can be flushed separately on the assumption that they will be much
smaller in number, thus reducing write amplification by permitting compaction independently
of any low-velocity data.
> Whilst the idea is reasonably straightforward, it seems that the biggest problem here
will be defining any success metric. Obviously any workload following an exponential/zipf/extreme
distribution is likely to benefit from such an approach, but whether or not that would translate
in real terms is another matter.
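
For readers unfamiliar with the data structure named in the issue description, here is a minimal sketch of a count-min sketch used to separate high-traffic partition keys from the rest. It is an illustration only, not code from Cassandra or from this ticket: the class name, the `split_for_flush` helper, the width/depth defaults, and the fixed threshold are all hypothetical, and it counts raw writes rather than a true velocity (dividing an estimate by the observation window would give writes per unit time).

```python
import hashlib


class CountMinSketch:
    """Fixed-size approximate counter. Estimates can overcount, never undercount."""

    def __init__(self, width=1024, depth=4):
        # width and depth are illustrative defaults, not tuned values.
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _slots(self, key):
        # One hash per row, derived by salting BLAKE2b with the row index.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                key.encode("utf-8"),
                digest_size=8,
                salt=row.to_bytes(8, "little"),
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        # Record `count` writes against the key in every row.
        for row, col in self._slots(key):
            self.rows[row][col] += count

    def estimate(self, key):
        # The smallest counter is the one least inflated by collisions.
        return min(self.rows[row][col] for row, col in self._slots(key))


def split_for_flush(sketch, partition_keys, threshold):
    """Split keys into (hot, cold) by estimated write count (hypothetical helper)."""
    hot, cold = [], []
    for key in partition_keys:
        (hot if sketch.estimate(key) >= threshold else cold).append(key)
    return hot, cold
```

In this sketch, each write would call `sketch.add(partition_key)`, and at flush time `split_for_flush` would decide which keys go to the separate high-traffic flush. Because collisions can only inflate a counter, a cold key may occasionally be classified as hot, but a genuinely hot key is never missed.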

This message was sent by Atlassian JIRA
