cassandra-commits mailing list archives

From "Alain RODRIGUEZ (JIRA)" <>
Subject [jira] [Created] (CASSANDRA-9509) Streams throughput control
Date Fri, 29 May 2015 10:45:17 GMT
Alain RODRIGUEZ created CASSANDRA-9509:

             Summary: Streams throughput control
                 Key: CASSANDRA-9509
             Project: Cassandra
          Issue Type: Improvement
          Components: Config
            Reporter: Alain RODRIGUEZ
            Priority: Minor

Currently, I have to keep retuning stream throughput manually (through nodetool
setstreamthroughput), since the same value applies, for example, to both a decommission and a
removenode. The point is that in the first case traffic flows from 1 --> N nodes (and is
obviously limited by the single sending node), while in the second it is N --> N nodes (N being
the number of remaining nodes). When removing a node, the throughput limit will not be reached
in most cases, and all the nodes will be under heavy load. So with the same stream throughput
value, we send N times faster on a removenode than on a decommission.
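To make the asymmetry concrete, here is a small sketch of the aggregate cluster traffic a fixed per-node cap produces in both cases (the node count and cap value are hypothetical numbers, purely for illustration):

```python
# Aggregate stream traffic under a fixed per-node throughput cap.
# Hypothetical illustration; the numbers below are made up.

def aggregate_throughput(cap_mbps, senders):
    """Total cluster-wide stream traffic when `senders` nodes
    each send at the per-node cap."""
    return cap_mbps * senders

nodes = 10   # N remaining nodes (hypothetical)
cap = 200    # nodetool setstreamthroughput value, in Mb/s (hypothetical)

# decommission: one leaving node streams its data out (1 -> N)
decommission_total = aggregate_throughput(cap, senders=1)

# removenode: every remaining node streams replicas (N -> N)
removenode_total = aggregate_throughput(cap, senders=nodes)

print(decommission_total)                       # 200 Mb/s
print(removenode_total)                         # 2000 Mb/s
print(removenode_total // decommission_total)   # 10: N times more traffic
```

With the same cap, the cluster-wide traffic during removenode is N times that of a decommission, which is exactly why a single setting cannot fit both operations.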

Another example: repair also goes faster as more nodes repair in parallel (we have 20 nodes,
repairing the data takes 2+ days, and repair has to complete within 10 days, so we can't run
one node at a time, and stream throughput needs to be adjusted accordingly).

Is there a way to:

- limit incoming network traffic on a node?
- limit cluster-wide outgoing network traffic?
- make streaming a background task (using remaining resources)? This looks harder to me since
the bottleneck depends on the node's hardware and workload. It can be the CPU, the network,
the disk throughput or even the memory...

If none of those ideas are doable, could we dissociate stream throughputs depending on the
operation, and configure them individually in cassandra.yaml?
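For instance, a per-operation split could look like the sketch below. These setting names are invented for illustration; no such options exist in cassandra.yaml today, which only has the single stream_throughput_outbound_megabits_per_sec knob:

```yaml
# Hypothetical per-operation stream caps (Mb/s) -- option names invented
decommission_stream_throughput_outbound_megabits_per_sec: 400
removenode_stream_throughput_outbound_megabits_per_sec: 50
repair_stream_throughput_outbound_megabits_per_sec: 100
bootstrap_stream_throughput_outbound_megabits_per_sec: 200
```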

