cassandra-user mailing list archives

From Jeff Jirsa <>
Subject Re: Slow performance after upgrading from 2.0.9 to 2.1.11
Date Thu, 14 Jan 2016 19:58:36 GMT
Sorry I wasn’t as explicit as I should have been

The same buffer size is used by compressed reads as well, but it is tuned with the compression_chunk_size
table property. It’s likely true that if you lower compression_chunk_size, you’ll see
improved read performance. 
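For example, lowering the chunk size to 4 KB on an existing 2.1 table might look like the
following (keyspace and table names here are placeholders, not from the thread):

```shell
# Hypothetical keyspace/table. In 2.1 the compression sub-options are
# 'sstable_compression' and 'chunk_length_kb'; the default chunk is 64 KB.
cqlsh -e "ALTER TABLE mykeyspace.mytable
          WITH compression = {'sstable_compression': 'LZ4Compressor',
                              'chunk_length_kb': 4};"

# Existing SSTables keep their old chunk size until rewritten:
nodetool upgradesstables -a mykeyspace mytable
```

The smaller chunk means each compressed read decompresses less data, at the cost of a
lower compression ratio.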

This was covered in the AWS re:Invent youtube link I sent in my original reply.

From:  "Peddi, Praveen"
Reply-To:  ""
Date:  Thursday, January 14, 2016 at 11:36 AM
To:  "", Zhiyan Shao
Cc:  "Agrawal, Pratik"
Subject:  Re: Slow performance after upgrading from 2.0.9 to 2.1.11

We will try with a reduced “rar_buffer_size” of 4KB. However, CASSANDRA-10249 says "this
only affects users who have 1. disabled compression, 2. switched to buffered i/o from mmap’d".
I believe neither of these is true for us. We use the default disk_access_mode, which should be mmap,
and we used LZ4Compressor when we created the table.

We will let you know if this property has any effect. We were testing with 2.1.11, and this
was only fixed in 2.1.12, so we need to test with the latest version.


From: Jeff Jirsa <>
Reply-To: <>
Date: Thursday, January 14, 2016 at 1:29 PM
To: Zhiyan Shao <>, "" <>
Cc: "Agrawal, Pratik" <>
Subject: Re: Slow performance after upgrading from 2.0.9 to 2.1.11

This may be due to / 
- whether or not this is really the case depends on how much of your data is in page cache,
and whether or not you’re using mmap. Since the original question was asked by someone using
small-RAM instances, it’s possible. 

We mitigate this by dropping compression_chunk_size in order to force a smaller buffer on
reads, so we don’t over-read very small blocks. This has other side effects (lower compression
ratio, more garbage during streaming), but significantly speeds up read workloads for us.

From: Zhiyan Shao
Date: Thursday, January 14, 2016 at 9:49 AM
To: ""
Cc: Jeff Jirsa, "Agrawal, Pratik"
Subject: Re: Slow performance after upgrading from 2.0.9 to 2.1.11

Praveen, if you search "Read is slower in 2.1.6 than 2.0.14" in this forum, you can find another
thread I sent a while ago. The perf test I did indicated that reads are slower in 2.1.6 than
in 2.0.14, so we stayed with 2.0.14.

On Tue, Jan 12, 2016 at 9:35 AM, Peddi, Praveen <> wrote:
Thanks Jeff for your reply. Sorry for delayed response. We were running some more tests and
wanted to wait for the results.

So basically we saw higher CPU with 2.1.11 compared to 2.0.9 (see below) for the
same exact load test. Memory spikes were also more aggressive on 2.1.11.

So we wanted to rule out any of our custom settings, so we ended up doing some testing with
cassandra-stress and a default Cassandra installation. Here are the results we saw between
2.0.9 and 2.1.11. Both are default installations and both use cassandra-stress with the same
params, so this is the closest apples-to-apples comparison we can get. As you can see, both read and
write latencies are 30 to 50% worse in 2.1.11 than in 2.0.9.

Highlights of the test:
Load: 2x reads and 1x writes
CPU: 2.0.9 (goes up to 25%) compared to 2.1.11 (goes up to 60%)
Local read latency: 0.039 ms for 2.0.9 vs 0.066 ms for 2.1.11

Local write latency: 0.033 ms for 2.0.9 vs 0.030 ms for 2.1.11

One observation: as the number of threads increases, 2.1.11 read latencies get
worse compared to 2.0.9 (see the table below for 24 threads vs 54 threads).

Not sure if anyone has done this kind of comparison before and what their thoughts are. I
am thinking for this same reason 

Results below show total ops and op/s from cassandra-stress (pk/s and row/s matched op/s
in every run); the per-percentile latency columns did not survive the archive formatting.

Version  Threads  Op     Total ops  Op/s
2.0.9    16       READ   66854      7205
2.0.9    16       WRITE  33146      3572
2.0.9    16       total  100000     10777
2.1.11   16       READ   67096      6818
2.1.11   16       WRITE  32904      3344
2.1.11   16       total  100000     10162
2.0.9    24       READ   66414      8167
2.0.9    24       WRITE  33586      4130
2.0.9    24       total  100000     12297
2.1.11   24       READ   66628      7433
2.1.11   24       WRITE  33372      3723
2.1.11   24       total  100000     11155
2.0.9    54       READ   67115      13419
2.0.9    54       WRITE  32885      6575
2.0.9    54       total  100000     19993
2.1.11   54       READ   66780      8951
2.1.11   54       WRITE  33220      4453
2.1.11   54       total  100000     13404
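The aggregate op/s numbers make the trend concrete; a quick sketch of the throughput drop
per thread count (values taken from the runs above):

```python
# Total op/s per run: (2.0.9, 2.1.11), keyed by thread count.
ops = {
    16: (10777, 10162),
    24: (12297, 11155),
    54: (19993, 13404),
}

for threads, (v20, v21) in sorted(ops.items()):
    # Percentage throughput drop going from 2.0.9 to 2.1.11.
    drop = (v20 - v21) / v20 * 100
    print(f"{threads} threads: 2.1.11 is {drop:.1f}% slower")
```

The gap grows with concurrency: roughly 5.7% at 16 threads, 9.3% at 24, and 33% at 54,
which matches the observation that 2.1.11 degrades as thread count increases.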

From: Jeff Jirsa <>
Date: Thursday, January 7, 2016 at 1:01 AM
To: "" <>, Peddi Praveen <>
Subject: Re: Slow performance after upgrading from 2.0.9 to 2.1.11

Anecdotal evidence typically agrees that 2.1 is faster than 2.0 (our experience was anywhere
from 20-60%, depending on workload).

However, it’s not necessarily true that everything behaves exactly the same – in particular,
memtables are different, commitlog segment handling is different, and GC params may need to
be tuned differently for 2.1 than 2.0.

When the system is busy, what’s it actually DOING? Cassandra exposes a TON of metrics –
have you plugged any into a reporting system to see what’s going on? Is your latency due
to pegged cpu, iowait/disk queues or gc pauses? 

My colleagues spent a lot of time validating different AWS EBS configs (video from reinvent
at, 2.1 was faster in almost every case, but
you’re using an instance size I don’t believe we tried (too little RAM to be viable in
production).  c3.2xl only gives you 15G of ram – most “performance” based systems want
2-4x that (people running G1 heaps usually start at 16G heaps and leave another 16-30G for
page cache), you’re running fairly small hardware – it’s possible that 2.1 isn’t “as
good” on smaller hardware. 

(I do see your domain, presumably you know all of this, but just to be sure):

You’re using c3, so presumably you’re using EBS – are you using GP2? Which volume sizes?
Are they the same between versions? Are you hitting your iops limits? Running out of burst
tokens? Do you have enhanced networking enabled? At load, what part of your system is stressed?
Are you cpu bound? Are you seeing GC pauses hurt latency? Have you tried changing memtable_allocation_type
to offheap_objects (available in 2.1, not in 2.0)? 
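Trying off-heap memtables is a one-line yaml change (sizes below are illustrative, not
recommendations from the thread):

```yaml
# cassandra.yaml, Cassandra 2.1+: allocate memtable cell data off-heap
# to reduce GC pressure.
memtable_allocation_type: offheap_objects

# Optionally split memtable space between heap and off-heap:
# memtable_heap_space_in_mb: 2048
# memtable_offheap_space_in_mb: 2048
```

A rolling restart is needed for the setting to take effect.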

Tuning gc_grace is weird – do you understand what it does? Are you overwriting or deleting
a lot of data in your test (that’d be unusual)? Are you doing a lot of compaction?

From: "Peddi, Praveen"
Reply-To: ""
Date: Wednesday, January 6, 2016 at 11:41 AM
To: ""
Subject: Slow performance after upgrading from 2.0.9 to 2.1.11

We have upgraded Cassandra from 2.0.9 to 2.1.11 in our load test environment with pretty much
the same yaml settings in both (removed unused yaml settings and renamed a few others), and we have
noticed that performance on 2.1.11 is worse compared to 2.0.9. After more investigation we found
that the performance gets worse as we increase the replication factor on 2.1.11, whereas on 2.0.9
performance is more or less the same. Has anything architecturally changed as far as replication
is concerned in 2.1.11?

All googling only suggested 2.1.11 should be FASTER than 2.0.9, so we are obviously doing something
different. However, the client code and load test are identical in both cases.

Nodes: 3 EC2 c3.2xlarge
R/W consistency: QUORUM
Renamed memtable_total_space_in_mb to memtable_heap_space_in_mb and removed unused properties
from the yaml file.
We run aggressive compaction with a low gc_grace (15 mins), but this is true for both
2.0.9 and 2.1.11.
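For reference, a 15-minute gc_grace would be set per table along these lines (keyspace and
table names here are placeholders):

```shell
# Hypothetical keyspace/table; 900 seconds = the 15-minute gc_grace
# described above. The default is 864000 (10 days).
cqlsh -e "ALTER TABLE loadtest.mytable WITH gc_grace_seconds = 900;"
```

Note that such a short gc_grace is only safe if repairs (or full tombstone compaction)
complete well within that window.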

As you can see, all p50, p90 and p99 latencies stayed within a 10% difference on 2.0.9 when
we increased RF from 1 to 3, whereas on 2.1.11 latencies almost doubled (reads especially
are much slower than writes).

(Table of latencies by # nodes, RF, and # of rows did not survive the archive formatting.)
Any pointers on how to debug performance issues will be appreciated.

