cassandra-user mailing list archives

From Benjamin Roth <>
Subject Re: High disk io read load
Date Mon, 20 Feb 2017 20:39:18 GMT
Hah! Found the problem!

After setting read_ahead to 0 and the compression chunk size to 4 KB on all CFs,
the situation was nearly perfect (please see below)! I scrubbed some CFs
but not the whole dataset yet. So it really was not a case of too little RAM.
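For reference, a sketch of the two changes described above. The device path, keyspace, and table names are placeholders, and the compressor class is an assumption; pick your actual data disk via lsblk:

```shell
# Disable read-ahead on the data disk (placeholder device: /dev/sda)
sudo blockdev --setra 0 /dev/sda
sudo blockdev --getra /dev/sda   # verify; should print 0

# Shrink the compression chunk size to 4 KB (placeholder keyspace/table: ks.cf)
cqlsh -e "ALTER TABLE ks.cf WITH compression = {'class': 'LZ4Compressor', 'chunk_length_in_kb': 4};"

# Existing SSTables keep the old chunk size until rewritten,
# e.g. via scrub (as done here) or upgradesstables -a
nodetool upgradesstables -a ks cf
```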

Some stats:
- Latency of a quite large CF:
- Disk throughput:
- Dstat:
- This shows that the request distribution remained the same, so no
dynamic-snitch magic:

Btw, I stumbled across this one: https://groups.google.com/forum/#!topic/scylladb-dev/j_qXSP-6-gY
Maybe we should also think about lowering the default chunk length.
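To make the chunk-length effect concrete, here is my own back-of-the-envelope illustration (the ~100-byte average record size is an assumption, not a number from this thread): a cold point-read must decompress a whole chunk to serve one row, so shrinking the chunk from 64 KB to 4 KB cuts the read amplification 16x.

```python
# Read amplification per cold point-read: the whole compression chunk
# is read and decompressed to serve a single row.
record_size = 100  # bytes; hypothetical average row size

for chunk_kb in (64, 4):
    chunk_bytes = chunk_kb * 1024
    amplification = chunk_bytes / record_size
    print(f"{chunk_kb:2d} KB chunk -> {amplification:6.1f}x amplification")
```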

*Unfortunately schema changes had a disturbing effect:*
- I changed the chunk size with a script, so there were a lot of schema
changes in a small period.
- After all tables were changed, one of the seed hosts (cas1) became totally
unstable:
- Latency on this host was 10x that of all other hosts.
- There were more ParNew GCs.
- Load was very high (up to 80, with 100% CPU).
- The whole system was unstable due to unpredictable latencies and
backpressure.
- Even SELECT * FROM system_schema.tables etc. appeared as slow queries in the
log.
- It was the 1st server in the connect-host list for the PHP client.
- A CS restart didn't help. A reboot did not help either (the cold page cache
probably made it worse).
- All other nodes were totally ok.
- Stopping CS on cas1 helped keep the system stable and brought down
latency again, but it was no solution.

=> Only replacing the node (with a newer, faster node) in the connect-host
list helped that situation.

Any ideas why changing schemas and/or chunk size could have such an effect?
For some time the situation was really critical.

2017-02-20 10:48 GMT+01:00 Bhuvan Rawal <>:

> Hi Benjamin,
> Yes, a read-ahead of 8 would imply a higher IO count from disk, but it should
> not cause more data to be read off the disk, as is happening in your case.
> One probable reason for the high disk IO is that the 512-vnode node has a
> lower page-cache-to-data ratio of 22% (100G buff / 437G data) compared to 46%
> (100G/237G). And as your avg record size is in bytes, for every disk IO you
> are fetching a complete 64K block to get a row.
> Perhaps you can balance the nodes by adding equivalent RAM?
> Regards,
> Bhuvan
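Bhuvan's page-cache argument above can be restated as a quick computation, using the buffer and data sizes quoted in his mail (the exact percentages depend on rounding):

```python
# Fraction of on-disk data that fits in the page cache (sizes in GB, from the mail)
def cache_coverage(buff_gb: float, data_gb: float) -> float:
    return buff_gb / data_gb

print(f"512-vnode node: {cache_coverage(100, 437):.0%}")  # roughly the 22% quoted
print(f"other node:     {cache_coverage(100, 237):.0%}")  # roughly the 46% quoted
```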
