According to this:

Bloom filter is still on by default for LCS in 1.2.X


From: "Hiller, Dean"
Sent: Monday, March 4, 2013 10:42 AM
Subject: Re: Poor read latency

The recommended setting is 8G of RAM, and your memory use grows with the number of rows through index samples (configured in cassandra.yaml as samples per row something… look for the word index).  Also, bloom filter memory grows with the number of rows if you are using size-tiered compaction.  We are actually trying to switch to leveled compaction in 1.2.2, as I think its default is no bloom filters; LCS does not "really" need them, I think, since 90% of rows are in the highest tier (but this only works better for certain profiles, such as reads that are very heavy relative to the number of writes).
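
As a rough illustration of how index samples scale with row count, here is a minimal Python sketch. It assumes one sample is kept in memory per `index_interval` rows and uses 128 for that value, which should be checked against your own cassandra.yaml. The function name is made up for illustration, and it counts samples only, not bytes.

```python
def estimated_index_samples(row_count, index_interval=128):
    """Estimate how many index samples are held in memory.

    Assumption: one sample per `index_interval` rows. The value 128 is
    an assumed setting; check your own cassandra.yaml.
    """
    if index_interval <= 0:
        raise ValueError("index_interval must be positive")
    return row_count // index_interval


if __name__ == "__main__":
    # Sample counts grow linearly with the number of rows on a node.
    for rows in (1_000_000, 10_000_000, 100_000_000):
        print(rows, estimated_index_samples(rows))
```

Raising the interval lowers the sample count (and the memory they take) at the cost of more work per index lookup.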


From: Tom Martin
Date: Monday, March 4, 2013 11:20 AM
Subject: Re: Poor read latency

Yeah, I just checked and the heap size 0.75 warning has been appearing.

nodetool info reports:

Heap Memory (MB) : 563.88 / 1014.00
Heap Memory (MB) : 646.01 / 1014.00
Heap Memory (MB) : 639.71 / 1014.00

We have plenty of free memory on each instance.  Do we need bigger instances or should we just configure each node to have a bigger max heap?
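
To put those numbers against the 0.75 threshold, here is a small Python sketch that parses the "Heap Memory (MB)" lines exactly as pasted above and prints the used/max ratio. The regular expression and function name are illustrative assumptions, not part of nodetool. A single snapshot can sit below the threshold even if the warning fired earlier, so treat the result as a spot check only.

```python
import re

# Matches lines such as "Heap Memory (MB) : 563.88 / 1014.00".
HEAP_LINE = re.compile(r"Heap Memory \(MB\)\s*:\s*([\d.]+)\s*/\s*([\d.]+)")


def heap_ratio(line):
    """Return used/max heap as a fraction for one nodetool info line."""
    match = HEAP_LINE.search(line)
    if match is None:
        raise ValueError("not a heap memory line: %r" % line)
    used, maximum = float(match.group(1)), float(match.group(2))
    return used / maximum


if __name__ == "__main__":
    lines = [
        "Heap Memory (MB) : 563.88 / 1014.00",
        "Heap Memory (MB) : 646.01 / 1014.00",
        "Heap Memory (MB) : 639.71 / 1014.00",
    ]
    for line in lines:
        ratio = heap_ratio(line)
        status = "at or above 0.75" if ratio >= 0.75 else "below 0.75"
        print("%.2f %s" % (ratio, status))
```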

On Mon, Mar 4, 2013 at 6:10 PM, Hiller, Dean wrote:
What does nodetool info say for your memory?  (We hit that one with memory near the max, and it slowed down our system big time… still working on resolving it too.)

Do any logs report hitting 0.75 and running compaction or, worse, hitting 0.85 and running compaction?  You typically get that if the above is the case.


From: Tom Martin
Date: Monday, March 4, 2013 10:31 AM
Subject: Poor read latency

Hi all,

We have a small (3 node) Cassandra cluster on AWS.  We have a replication factor of 3, a read consistency level of local_quorum, and are using the ephemeral disk.  We're getting pretty poor read performance and quite high read latency in cfstats.  For example:

Column Family: AgentHotel
SSTable count: 4
Space used (live): 829021175
Space used (total): 829021175
Number of Keys (estimate): 2148352
Memtable Columns Count: 0
Memtable Data Size: 0
Memtable Switch Count: 0
Read Count: 67204
Read Latency: 23.813 ms.
Write Count: 0
Write Latency: NaN ms.
Pending Tasks: 0
Bloom Filter False Positives: 50
Bloom Filter False Ratio: 0.00201
Bloom Filter Space Used: 7635472
Compacted row minimum size: 259
Compacted row maximum size: 4768
Compacted row mean size: 873

For comparison, we have a similar setup in another cluster for an old project (hosted on Rackspace) where we're getting sub-1 ms read latencies.  We are using multigets on the client (Hector) but are only requesting ~40 rows per request on average.

I feel like we should reasonably expect better performance but perhaps I'm mistaken.  Is there anything super obvious we should be checking out?
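
For reference, the bloom filter figures in the cfstats output above can be turned into more readable numbers with a short Python sketch. The helper names are hypothetical, and "Number of Keys (estimate)" is only an estimate, so the bits-per-key result is approximate.

```python
def bloom_filter_mib(space_used_bytes):
    """Convert 'Bloom Filter Space Used' (bytes) to MiB."""
    return space_used_bytes / (1024.0 * 1024.0)


def bloom_bits_per_key(space_used_bytes, estimated_keys):
    """Approximate bloom filter bits spent per key."""
    if estimated_keys <= 0:
        raise ValueError("estimated_keys must be positive")
    return space_used_bytes * 8.0 / estimated_keys


if __name__ == "__main__":
    # Values copied from the AgentHotel cfstats output.
    space_used = 7635472   # Bloom Filter Space Used
    keys = 2148352         # Number of Keys (estimate)
    print("%.2f MiB" % bloom_filter_mib(space_used))
    print("%.1f bits per key" % bloom_bits_per_key(space_used, keys))
```

With these inputs the filter comes to roughly 7.3 MiB, or about 28 bits per estimated key.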