We have a small (3-node) Cassandra cluster on AWS. We have a replication factor of 3 and a read consistency level of LOCAL_QUORUM, and we are storing the data on the ephemeral disk. We're getting pretty poor read performance and quite high read latency in cfstats. For example:
Column Family: AgentHotel
SSTable count: 4
Space used (live): 829021175
Space used (total): 829021175
Number of Keys (estimate): 2148352
Memtable Columns Count: 0
Memtable Data Size: 0
Memtable Switch Count: 0
Read Count: 67204
Read Latency: 23.813 ms.
Write Count: 0
Write Latency: NaN ms.
Pending Tasks: 0
Bloom Filter False Positives: 50
Bloom Filter False Ratio: 0.00201
Bloom Filter Space Used: 7635472
Compacted row minimum size: 259
Compacted row maximum size: 4768
Compacted row mean size: 873
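To make these numbers easier to compare against the other cluster, here is a minimal Python sketch that parses cfstats-style "Name: value" lines into a dictionary and derives the total time spent in reads. The function name `parse_cfstats` and the derived figure are illustrative assumptions, not part of any Cassandra tooling, and the sample is only a subset of the output above.

```python
def parse_cfstats(text):
    """Parse 'Name: value' lines (as printed by cfstats) into a dict.

    Hypothetical helper for illustration; numeric values lose their
    unit suffix (e.g. 'ms.'), everything else is kept as a string.
    """
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'Name: value' line
        name, value = name.strip(), value.strip()
        token = value.split()[0] if value else ""
        try:
            stats[name] = int(token)
        except ValueError:
            try:
                stats[name] = float(token)  # also accepts 'NaN'
            except ValueError:
                stats[name] = value
    return stats


# Subset of the output shown above.
sample = """\
Column Family: AgentHotel
SSTable count: 4
Space used (live): 829021175
Read Count: 67204
Read Latency: 23.813 ms.
Write Latency: NaN ms.
"""

stats = parse_cfstats(sample)

# Total time spent in reads, assuming the reported latency is the
# mean over all counted reads on this node.
total_read_seconds = stats["Read Count"] * stats["Read Latency"] / 1000.0
print(stats["Column Family"], round(total_read_seconds, 1), "s in reads")
```

Running the same parser over the output from the Rackspace cluster would give a field-by-field comparison (SSTable count, row sizes, bloom filter figures) rather than just the headline latency.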
For comparison, we have a similar setup in another cluster for an old project (hosted on Rackspace) where we're getting sub-1 ms read latencies. We are using multigets on the client (Hector) but are only requesting ~40 rows per request on average.
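As a rough sanity check on what this setup implies per request, the sketch below works out the quorum size for a replication factor of 3 and brackets the time a 40-row multiget could take. It assumes the cfstats read latency is a per-row local read time on one node and ignores network and coordinator overhead; the function names are hypothetical, and the "serial" and "parallel" cases are idealised bounds, not a description of how Cassandra actually schedules a multiget.

```python
def quorum_size(replication_factor):
    """Replicas that must respond for a quorum read: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def multiget_latency_bounds_ms(rows, per_row_ms):
    """Idealised bounds for one multiget (assumption, not measured).

    Lower bound: every row is read fully in parallel.
    Upper bound: rows are read strictly one after another.
    """
    return per_row_ms, rows * per_row_ms


replicas_needed = quorum_size(3)  # RF = 3, as in our cluster
best_ms, worst_ms = multiget_latency_bounds_ms(40, 23.813)

print("replicas per read:", replicas_needed)
print("multiget bounds (ms):", best_ms, "to", round(worst_ms, 2))
```

With a replication factor of 3 on 3 nodes, every node holds every row, so each row read waits on 2 of the 3 nodes. Where a real request falls between the two bounds depends on how much of the per-row work overlaps.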
I feel like we should reasonably expect better performance but perhaps I'm mistaken. Is there anything super obvious we should be checking out?