cassandra-commits mailing list archives

From "sankalp kohli (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-7203) Flush (and Compact) High Traffic Partitions Separately
Date Tue, 02 Dec 2014 17:37:12 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231795#comment-14231795 ]

sankalp kohli commented on CASSANDRA-7203:
------------------------------------------

" and I think we have bigger fish to fry."
I agree with Jason here :).

I have not thought about all the use cases we have, but this is not currently a problem.

> Flush (and Compact) High Traffic Partitions Separately
> ------------------------------------------------------
>
>                 Key: CASSANDRA-7203
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7203
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>              Labels: compaction, performance
>
> An idea possibly worth exploring is the use of streaming count-min sketches to collect
data over the uptime of a server to estimate the velocity of different partitions, so that
high-volume partitions can be flushed separately on the assumption that they will be much
smaller in number, thus reducing write amplification by permitting compaction independently
of any low-velocity data.
> Whilst the idea is reasonably straightforward, it seems that the biggest problem here
will be defining any success metric. Obviously any workload following an exponential/Zipf/extreme
distribution is likely to benefit from such an approach, but whether or not that would translate
in real terms is another matter.
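
For readers unfamiliar with the data structure named in the description, the following is a minimal sketch of a count-min sketch used to pick out high-traffic partition keys. It is illustrative only and is not code from Cassandra or from this ticket: the class name, the `hot_partitions` helper, the table dimensions, and the threshold are all assumptions chosen for the example. A count-min sketch never undercounts a key, but hash collisions can make it overcount, so a cold partition may occasionally be misclassified as hot.

```python
import hashlib


class CountMinSketch:
    """Approximate per-key counters in fixed memory (hypothetical example class)."""

    def __init__(self, width=2048, depth=4):
        # width and depth are illustrative values, not tuned recommendations.
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _columns(self, key):
        # Derive one column per row by salting the hash with the row index.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                key.encode("utf-8"),
                digest_size=8,
                salt=row.to_bytes(8, "little"),
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        # Record `count` writes against the partition key.
        for row, col in self._columns(key):
            self.rows[row][col] += count

    def estimate(self, key):
        # The minimum across rows is the estimate least inflated by collisions.
        return min(self.rows[row][col] for row, col in self._columns(key))


def hot_partitions(sketch, candidate_keys, threshold):
    """Return the candidate keys whose estimated write count reaches the threshold."""
    return [k for k in candidate_keys if sketch.estimate(k) >= threshold]


if __name__ == "__main__":
    sketch = CountMinSketch()
    # A skewed workload: one partition takes most of the writes.
    for _ in range(1000):
        sketch.add("partition-hot")
    for i in range(100):
        sketch.add(f"partition-cold-{i}")
    keys = ["partition-hot"] + [f"partition-cold-{i}" for i in range(100)]
    print(hot_partitions(sketch, keys, threshold=500))
```

Dividing an estimate by the length of the observation window would give a write rate, which is one possible reading of "velocity" in the description. How such a rate would feed into flush or compaction decisions is left open by the ticket.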



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
