cassandra-user mailing list archives

From Anuj Wadehra <anujw_2...@yahoo.co.in>
Subject Re: GC and compaction behaviour in a multi-DC environment
Date Wed, 16 Dec 2015 17:33:26 GMT
Hi Vasileios,

My comments:


1. I am not certain that the heap utilisation seen in the graphs is healthy.


Also, I'm not sure whether the low utilisation of the survivor spaces seen above is expected during GC activity. How do these two things relate (GC - compactions)?


What makes the old generation keep increasing when the survivors are underutilised (~5%)?


Anuj: I think your new gen and tenuring threshold are too small. Memtable and compaction objects
may move to the old gen too quickly because the survivor spaces don't have enough room. By default,
memtable_total_space_in_mb is 1/4 of the heap, which means 2 GB for an 8 GB heap. Moreover, an 8 GB heap on a 16 GB system is high.


Even though your GC pauses are not huge, I would suggest you try the following settings:


memtable_total_space_in_mb: 500

The above setting (in cassandra.yaml) will lead to more I/O, but on SSDs that's OK.


MAX_HEAP_SIZE="6G"
HEAP_NEWSIZE="1200M"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=2"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=8"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=50"
JVM_OPTS="$JVM_OPTS -XX:+UnlockDiagnosticVMOptions"
JVM_OPTS="$JVM_OPTS -XX:+UseGCTaskAffinity"
JVM_OPTS="$JVM_OPTS -XX:+BindGCTaskThreadsToCPUs"
JVM_OPTS="$JVM_OPTS -XX:ParGCCardsPerStrideChunk=16384"
JVM_OPTS="$JVM_OPTS -XX:+CMSScavengeBeforeRemark"
JVM_OPTS="$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=30000"
JVM_OPTS="$JVM_OPTS -XX:+CMSEdenChunksRecordAlways"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelInitialMarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"


I'd appreciate your input on this. In addition, is it normal for DC2 to have such a huge difference
in GC activity compared to DC1?


Anuj: I don't think major GC differences are likely with the same setup and workload. Make
sure that you follow all the DataStax recommended settings at: https://docs.datastax.com/en/cassandra/2.0/cassandra/install/installRecommendSettings.html
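
For reference, a few of the usual items from that page, sketched as shell commands (not the full list; consult the page itself for current recommendations):

# Disable swap so the JVM is never paged out
swapoff --all
# Disable zone reclaim on NUMA hardware
echo 0 > /proc/sys/vm/zone_reclaim_mode
# Raise resource limits for the cassandra user, e.g. in /etc/security/limits.d/cassandra.conf:
#   cassandra - memlock unlimited
#   cassandra - nofile  100000
#   cassandra - nproc   32768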


2. Compaction activity (as in frequency of compactions, not number of...) seems to be comparable
for both DCs (with the exception of point 3 above), so I'm not sure why the number of Data.db
files is consistently higher in DC2. Is this important, or is it a minor detail I shouldn't care about?


Anuj: 10 vs 15 is not significant. You can ignore it.

Thanks

Anuj

From:"Vasileios Vlachos" <vasileiosvlachos@gmail.com>
Date:Wed, 16 Dec, 2015 at 10:16 am
Subject:Re: GC and compaction behaviour in a multi-DC environment

Thanks for your reply,


Apologies, I didn't explain that properly... When I say average, I mean the average of the
10 samples that appear in the logs, not the average across all GCs that happen over time.
The same applies to both DCs.


The tenuring threshold has been left at the default value, which if I remember correctly must be
1 (I'll check again tomorrow). I was hoping that the new generation was large enough for the
tenuring threshold not to be an issue.
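
In the meantime, a one-liner can confirm the current values (a sketch assuming a package install with cassandra-env.sh under /etc/cassandra):

grep -E 'MaxTenuringThreshold|SurvivorRatio|HEAP_NEWSIZE' /etc/cassandra/cassandra-env.sh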


But from all the searching I've done, when people have GC issues their graphs make it
obvious that something is wrong. My problem is that I have too many unknowns at the
moment to conclude that something is wrong. All I've done is report what I see and keep
investigating while asking for some input here.
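
For reference, this is roughly how such a 10-sample average can be pulled out of the logs (a sketch that assumes the 2.0 GCInspector line format, e.g. "GC for ParNew: 412 ms for 1 collections ...", and the default log path):

grep GCInspector /var/log/cassandra/system.log | tail -n 10 \
  | grep -oE '[0-9]+ ms' \
  | awk '{sum += $1; n++} END {if (n) printf "%.0f ms average over %d pauses\n", sum / n, n}'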


On Tue, Dec 15, 2015 at 8:06 PM, Kai Wang <depend@gmail.com> wrote:

Check MaxTenuringThreshold in your cassandra-env.sh. If that threshold is too low, objects
will be moved to the old gen too quickly.

I am a little confused by your GC numbers on DC1. If DC1 only exceeded the 200ms GC threshold
fewer than 10 times in 4 days, how can its average GC duration be 400ms? Did I miss anything here?


On Tue, Dec 15, 2015 at 6:09 AM, Vasileios Vlachos <vasileiosvlachos@gmail.com> wrote:

Hello,

We are running Cassandra 2.0.16 across 2 DCs at the moment, each of which has 4 nodes. The replication
factor is 3 for all keyspaces, and all applications write/read using LOCAL_QUORUM. So, if DC1 is what's
regarded as "local", then DC2 gets all writes asynchronously. Nothing writes directly to DC2;
traffic flows only from clients -> DC1 -> DC2. Cassandra runs on physical servers at
DC1, whereas at DC2 it runs on virtual machines (we use VMware ESXi 5.1). Both physical and virtual
servers, however, have the same amount of resources available (CPU/memory/disks etc.). All
boxes have 16 GB of RAM. MAX_HEAP_SIZE is 8G and HEAP_NEWSIZE is 600M. We use the default CMS;
we haven't switched to G1.
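
For completeness, the replication described above can be confirmed from any node against the 2.0 system tables (keyspace names will vary):

cqlsh -e "SELECT keyspace_name, strategy_class, strategy_options FROM system.schema_keyspaces;"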

Observations:

1. The number of GC pauses on DC2 appears to be significantly higher. On DC1 they rarely
exceed the 200ms threshold that makes them appear in the logs. To give you some numbers:

DC1: 
    node1: 2 pauses logged over the past 4 days
    node2: 5
    node3: 2
    node4: 9

DC2:
    node1: 2475 pauses logged over the past 4 days
    node2: 3478
    node3: 3817
    node4: 2472

GC pause duration varies; for DC1 it is around 400ms, and for DC2 the average is about 400ms as well,
but there are several pauses which exceed 1 or even 2 seconds.
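
A sketch of how those counts and the long pauses can be extracted per node (same GCInspector line-format and log-path assumptions as a default 2.0 install):

# Total pauses logged, and pauses of 1 second or more
grep -c GCInspector /var/log/cassandra/system.log
grep GCInspector /var/log/cassandra/system.log | grep -oE '[0-9]+ ms' | awk '$1 >= 1000' | wc -l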

DC1 graphs:

[DC1 - node1 graph]
[DC1 - node2 graph]
[DC1 - node3 graph]
[DC1 - node4 graph]

DC2 graphs:

[DC2 - node1 graph]
[DC2 - node2 graph]
[DC2 - node3 graph]
[DC2 - node4 graph]

The low utilisation of the survivor spaces (presented as a "gap" in the graphs above, the cassandra03
graph for example) correlates with compaction activity on the same box:

[survivor space / compaction correlation graph]

2. ~10 *Data.db files per KS on DC1 nodes vs ~15 *Data.db files per KS on DC2 nodes.
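
Those counts can be reproduced with something like the following (the data directory path is an assumption; adjust to your layout):

for ks in /var/lib/cassandra/data/*/; do
    printf '%s: ' "$ks"
    find "$ks" -name '*-Data.db' | wc -l
done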

3. We are aware of CASSANDRA-9662 (thanks to this list!), but another observation is that
our monitoring system seems to report thousands of compactions on DC2 much more frequently:

DC1 Compaction Activity:

[DC1 compaction activity graph]

DC2 Compaction Activity:

[DC2 compaction activity graph]
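Live compaction activity can also be spot-checked per node with nodetool, which may help correlate the monitoring graphs:

nodetool compactionstats              # currently running and pending compactions
nodetool tpstats | grep -i compaction # CompactionExecutor backlog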

Questions:

1. I am not certain that the heap utilisation seen in the graphs is healthy. I'd appreciate
your input on this. In addition, is it normal for DC2 to have such a huge difference in GC
activity compared to DC1? Also, I'm not sure whether the low utilisation of the survivor spaces
seen above is expected during GC activity. How do these two things relate (GC - compactions)?
What makes the old generation keep increasing when the survivors are underutilised (~5%)?

2. Compaction activity (as in frequency of compactions, not number of...) seems to be comparable
for both DCs (with the exception of point 3 above), so I'm not sure why the number of Data.db
files is consistently higher in DC2. Is this important, or is it a minor detail I shouldn't care about?

3. I'm not sure I have a question regarding observation #3, because I'm going to upgrade to
2.0.17 (at least), but I included it here in case it helps with the first two observations.

Thanks in advance for any help!



