cassandra-user mailing list archives

From Blake Eggleston <bl...@shift.com>
Subject massive spikes in read latency
Date Sun, 05 Jan 2014 23:28:51 GMT
Hi,

I’ve been having a problem with 3 neighboring nodes in our cluster: their read latencies
jump to 9000-18000 ms for a few minutes (as reported by OpsCenter), then come back down.

We’re running a 6-node cluster on AWS hi1.4xlarge instances, with Cassandra reading from and
writing to 2 RAIDed SSDs.

I’ve added 2 nodes to the struggling part of the cluster, and aside from the latency spikes
shifting onto the new nodes, it has had no effect. I suspect that a single key that lives
on the first stressed node may be getting read heavily.
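
One way the hot-key suspicion could be checked from the application side is sketched below. It assumes the client can log a sample of the keys it reads; the function name `top_read_keys` and the idea of a sampled key log are hypothetical and not part of our setup or of any Cassandra tooling.

```python
from collections import Counter


def top_read_keys(sampled_keys, n=5):
    """Return the n most frequently read keys from a sample.

    sampled_keys: iterable of keys logged by the client (hypothetical input).
    Each result is (key, read_count, share_of_all_sampled_reads).
    """
    counts = Counter(sampled_keys)
    total = sum(counts.values())
    if total == 0:
        # Nothing was sampled, so there is nothing to rank.
        return []
    return [(key, count, count / total) for key, count in counts.most_common(n)]
```

If one key accounts for a large share of the sampled reads, that would support the single-hot-key theory; an even spread would point elsewhere.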

The spikes in latency don’t seem to be correlated with an increase in reads. The cluster
usually handles a maximum of 4200 reads/sec per node, with writes being significantly
lower, at ~200/sec per node. It is usually fine with this load, with read latencies
at around 3.5-10 ms/read, but once or twice an hour the latencies on the 3 nodes shoot
through the roof.

The disks aren’t showing serious use, with read and write rates on the SSD volume at around
1350 kBps and 3218 kBps, respectively. Each Cassandra process is maintaining 1000-1100 open
connections. GC logs aren’t showing any serious GC pauses.

Any ideas on what might be causing this?

Thanks,

Blake