cassandra-user mailing list archives

From "Ney, Richard" <>
Subject Re: Has anyone deployed a production cluster with less than 6 nodes per DC?
Date Mon, 26 Dec 2016 23:58:04 GMT
Everyone, thank you for the responses.

Jon, to answer your question we’re using the General Purpose SSD with IOPS of 1500/3000
so based on your definition I guess we’re using the awful ones since they aren’t provisioned
IOPS. We’re also trying G1 garbage collection.

I also just looked at our application setting overrides and it appears we are using CL=ONE
with RF=2 on both of the DCs. We’ve also disabled durable writes as shown in the keyspace
creation statement below:

    CREATE KEYSPACE reporting
      WITH replication = {'class': 'NetworkTopologyStrategy', 'us-east_dc1': '2', 'us-east_dc2': '2'}
      AND durable_writes = false;

The main table we’re interacting with has these compaction settings (these are Akka
Persistence journal tables):

compaction = {'bucket_high': '1.5', 'bucket_low': '0.5', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
'enabled': 'true', 'max_threshold': '32', 'min_sstable_size': '50', 'min_threshold': '4',
'tombstone_compaction_interval': '86400', 'tombstone_threshold': '0.2', 'unchecked_tombstone_compaction':

We’re also planning to set a TTL of about 3 hours on the table. We’re using these tables
for business continuity, so we don’t need the data to persist for long periods.
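
As a rough sketch of what the TTL change could look like, the snippet below converts 3 hours to seconds and builds a CQL statement that sets the table-level `default_time_to_live` option. The keyspace name `reporting` comes from the statement above; the table name `messages` is a hypothetical placeholder, not the actual journal table name.

```python
from datetime import timedelta


def ttl_seconds(hours: float) -> int:
    """Convert a TTL given in hours to whole seconds, the unit CQL expects."""
    return int(timedelta(hours=hours).total_seconds())


def default_ttl_statement(keyspace: str, table: str, hours: float) -> str:
    """Build an ALTER TABLE statement that sets the table's default TTL."""
    return (
        f"ALTER TABLE {keyspace}.{table} "
        f"WITH default_time_to_live = {ttl_seconds(hours)};"
    )


# "messages" is an assumed table name used only for illustration.
print(default_ttl_statement("reporting", "messages", 3))
```

A table-level default TTL applies only to data written after the change; rows already on disk keep whatever TTL they were written with.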

+1 (978) 848.6640 WORK
+1 (916) 846.2353 MOBILE


From: Jonathan Haddad <>
Reply-To: "" <>
Date: Monday, December 26, 2016 at 2:02 PM
To: "" <>
Subject: Re: Has anyone deployed a production cluster with less than 6 nodes per DC?

There's nothing wrong with running a 3 node DC.  A million writes an hour is averaging less
than 300 writes a second, which is pretty trivial.
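
The arithmetic behind that figure, as a quick sketch (the function name is just illustrative):

```python
SECONDS_PER_HOUR = 60 * 60


def avg_writes_per_second(writes_per_hour: float) -> float:
    """Average write rate per second for a given hourly write volume."""
    return writes_per_hour / SECONDS_PER_HOUR


rate = avg_writes_per_second(1_000_000)
print(f"{rate:.0f} writes/s on average")  # about 278, under 300
```

This is an average only; it says nothing about how bursty the real write pattern is.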

Are you running provisioned SSD EBS volumes or the traditional, awful ones?

RF=2 with QUORUM is kind of pointless; that's the same as CL=ALL.  Not recommended.  I don't
know why your timeouts are happening, but when they do, RF=2 w/ QUORUM is going to make the
problem worse.  Either use RF=3 or use CL=ONE.
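
To make the RF=2 point concrete, here is a minimal sketch of the quorum calculation, floor(RF / 2) + 1. It assumes the replica count is taken within a single DC (as with LOCAL_QUORUM); the function names are illustrative and not part of any driver API.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a quorum: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def replicas_allowed_down(replication_factor: int) -> int:
    """How many replicas can be unavailable while quorum still succeeds."""
    return replication_factor - quorum(replication_factor)


for rf in (2, 3):
    print(
        f"RF={rf}: quorum needs {quorum(rf)} replica(s), "
        f"tolerates {replicas_allowed_down(rf)} down"
    )
```

With RF=2 the quorum is 2 of 2, so every replica must answer, which is what CL=ALL requires. With RF=3 the quorum is still 2, so one replica can be slow or down without failing the request.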

Your management is correct here.  Throwing more hardware at this problem is the wrong solution
given that your current hardware should be able to handle over 100x what it's doing right
