cassandra-user mailing list archives

From Jack Krupansky <jack.krupan...@gmail.com>
Subject Re: compaction_throughput_mb_per_sec
Date Tue, 05 Jan 2016 22:20:58 GMT
I forwarded a comment to the docs team.

It appears that they picked up the language from the cassandra.yaml file
itself. Looking at how "system" is used in that file, it usually means the
node, that is, the box running the node.
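
To illustrate the per-node reading, here is a minimal sketch (an assumption
for illustration, not anything from the docs) that builds one
`nodetool setcompactionthroughput` command per node, so that each node can
get its own throttle at runtime. The host names and MB/s values are
hypothetical, and the script only prints the commands rather than running
them.

```python
# Hypothetical per-node throttle values in MB/s; the host names are made up.
THROTTLE_MB_PER_SEC = {
    "hdd-node-1": 16,
    "ssd-node-1": 64,
}


def nodetool_commands(throttles):
    """Build one `nodetool setcompactionthroughput` command per node.

    Each command targets a single host via `-h`, which reflects the
    per-node scope of the setting. A value of 0 disables throttling.
    """
    return [
        ["nodetool", "-h", host, "setcompactionthroughput", str(mb)]
        for host, mb in sorted(throttles.items())
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for cmd in nodetool_commands(THROTTLE_MB_PER_SEC):
        print(" ".join(cmd))
```

A value set this way applies only to the running node and is not written
back to cassandra.yaml, so a restart falls back to whatever the file says.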

-- Jack Krupansky

On Tue, Jan 5, 2016 at 9:50 AM, Ken Hancock <ken.hancock@schange.com> wrote:

> As to why I think it's cluster-wide, here's what the documentation says:
>
>
> https://docs.datastax.com/en/cassandra/1.2/cassandra/configuration/configCassandra_yaml_r.html
> compaction_throughput_mb_per_sec
> <https://docs.datastax.com/en/cassandra/1.2/cassandra/configuration/configCassandra_yaml_r.html?scroll=reference_ds_qfg_n1r_1k__compaction_throughput_mb_per_sec>
> (Default: 16) Throttles compaction to the specified total throughput
> across the entire system. The faster you insert data, the faster you need
> to compact in order to keep the SSTable count down. The recommended value
> is 16 to 32 times the rate of write throughput (in MBs/second). Setting the
> value to 0 disables compaction throttling.
>
> Perhaps "across the entire system" means "across all keyspaces for this
> Cassandra node"?
>
> Compare the above documentation with the subsequent one which specifically
> calls out "a node":
>
> concurrent_compactors
> <https://docs.datastax.com/en/cassandra/1.2/cassandra/configuration/configCassandra_yaml_r.html?scroll=reference_ds_qfg_n1r_1k__concurrent_compactors>
> (Default: 1 per CPU core**) Sets the number of concurrent compaction
> processes allowed to run simultaneously on a node, not including validation
> compactions for anti-entropy repair. Simultaneous compactions help preserve
> read performance in a mixed read-write workload by mitigating the tendency
> of small SSTables to accumulate during a single long-running compaction. If
> compactions run too slowly or too fast, change
> compaction_throughput_mb_per_sec
> <https://docs.datastax.com/en/cassandra/1.2/cassandra/configuration/configCassandra_yaml_r.html#reference_ds_qfg_n1r_1k__compaction_throughput_mb_per_sec>
> first.
>
> I always thought it was per-node, and I'm guessing this is a documentation
> clarity issue.
>
> On Mon, Jan 4, 2016 at 5:06 PM, Jeff Jirsa <jeff.jirsa@crowdstrike.com>
> wrote:
>
>> Why do you think it’s cluster-wide? That param is per-node, and you can
>> change it at runtime with nodetool (or via the JMX interface, using jconsole
>> to connect to ip:7199).
>>
>>
>>
>> From: Ken Hancock
>> Reply-To: "user@cassandra.apache.org"
>> Date: Monday, January 4, 2016 at 12:59 PM
>> To: "user@cassandra.apache.org"
>> Subject: compaction_throughput_mb_per_sec
>>
>> I was surprised the other day to discover that this was a cluster-wide
>> setting. Why does that make sense?
>>
>> In a heterogeneous Cassandra deployment, say I have some old servers
>> running spinning disks and I'm bringing on more nodes that perhaps utilize
>> SSDs. I want to have different compaction throttling on different nodes to
>> minimize the impact on read times.
>>
>> I can already balance data ownership through either token allocation or
>> vnode counts.
>>
>> Also, as I increase my node count, I technically also have to increase my
>> compaction_throughput which would require a rolling restart across the
>> cluster.
>>
>>
>>
>
>
>
