cassandra-user mailing list archives

From Carlos Rolo <r...@pythian.com>
Subject Re: cassandra OOM
Date Tue, 25 Apr 2017 17:56:21 GMT
To add to this thread: we have seen both cases, with CMS easily
outperforming G1 at the same heap size, and the inverse too. On one cluster
with different workloads per datacenter we run both collectors, each chosen
for how it performs on that workload.

It would be good to collect this information and do a talk/blog, but that is
for a later time.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant / Datastax Certified Architect / Cassandra MVP

Pythian - Love your data

rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | LinkedIn:
linkedin.com/in/carlosjuzarterolo
Mobile: +351 918 918 100
www.pythian.com

On Tue, Apr 25, 2017 at 6:47 PM, Durity, Sean R <SEAN_R_DURITY@homedepot.com> wrote:

> We have seen much better stability (and MUCH fewer GC pauses) from G1 with
> a variety of heap sizes. I don’t even consider CMS any more.
>
>
>
>
>
> Sean Durity
>
>
>
> *From:* Gopal, Dhruva [mailto:Dhruva.Gopal@Aspect.com]
> *Sent:* Tuesday, April 04, 2017 5:34 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: cassandra OOM
>
>
>
> Thanks, that’s interesting – so CMS is a better option for
> stability/performance? We’ll try this out in our cluster.
>
>
>
> *From: *Alexander Dejanovski <alex@thelastpickle.com>
> *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Date: *Monday, April 3, 2017 at 10:31 PM
> *To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Subject: *Re: cassandra OOM
>
>
>
> Hi,
>
>
>
> we've seen G1GC going OOM on production clusters (repeatedly) with a 16GB
> heap when the workload is intense, and given you're running on m4.2xl I
> wouldn't go over 16GB for the heap.
>
>
>
> I'd suggest reverting to CMS, using a 16GB heap and up to 6GB of new
> gen. You can use 5 as MaxTenuringThreshold as an initial value and activate
> GC logging to fine tune the settings afterwards.
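
A minimal sketch of what that suggestion could look like as jvm.options lines. The helper name cms_jvm_options is hypothetical; every flag it emits appears in the jvm.options file quoted further down in this thread, and the values (16G heap, 6G new gen, MaxTenuringThreshold=5) are the starting points suggested above, not tuned results.

```python
def cms_jvm_options(heap_gb=16, new_gen_gb=6, max_tenuring=5,
                    gc_log="/var/log/cassandra/gc.log"):
    """Render jvm.options lines for a CMS setup (hypothetical helper)."""
    return [
        # Fixed heap: min and max set to the same value.
        f"-Xms{heap_gb}G",
        f"-Xmx{heap_gb}G",
        # Young generation size ("up to 6GB of new gen").
        f"-Xmn{new_gen_gb}G",
        # CMS for the old generation, ParNew for the young generation.
        "-XX:+UseParNewGC",
        "-XX:+UseConcMarkSweepGC",
        f"-XX:MaxTenuringThreshold={max_tenuring}",
        # GC logging, so the settings can be fine-tuned afterwards.
        "-XX:+PrintGCDetails",
        "-XX:+PrintGCDateStamps",
        "-XX:+PrintTenuringDistribution",
        f"-Xloggc:{gc_log}",
    ]


if __name__ == "__main__":
    print("\n".join(cms_jvm_options()))
```

When switching, the G1 lines (-XX:+UseG1GC and the G1-specific options) would need to be commented out, and the remaining lines of the CMS section uncommented as well.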
>
>
>
> FYI CMS tends to perform better than G1 even though it's a little bit
> harder to tune.
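
One way to compare collectors on a given workload is to total the stop-the-world time recorded in each GC log. The sketch below assumes Java 8 HotSpot logs written with -XX:+PrintGCApplicationStoppedTime (one of the logging options in the jvm.options file quoted in this thread) and the "Total time for which application threads were stopped" line that flag produces; pause_summary is a hypothetical helper name.

```python
import re

# Line emitted by -XX:+PrintGCApplicationStoppedTime on Java 8 HotSpot.
STOPPED = re.compile(
    r"Total time for which application threads were stopped: ([0-9.]+) seconds"
)


def pause_summary(lines):
    """Return count, total and max of stop-the-world pauses, in seconds."""
    pauses = []
    for line in lines:
        match = STOPPED.search(line)
        if match:
            pauses.append(float(match.group(1)))
    if not pauses:
        return {"count": 0, "total_s": 0.0, "max_s": 0.0}
    return {"count": len(pauses), "total_s": sum(pauses), "max_s": max(pauses)}


if __name__ == "__main__":
    # The path matches the -Xloggc value in the quoted jvm.options file.
    with open("/var/log/cassandra/gc.log") as log:
        print(pause_summary(log))
```

Running it over logs captured under the same load with CMS and with G1 gives a like-for-like view of pause count and worst-case pause.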
>
>
>
> Cheers,
>
>
>
> On Mon, Apr 3, 2017 at 10:54 PM Gopal, Dhruva <Dhruva.Gopal@aspect.com>
> wrote:
>
> 16 Gig heap, with G1. Pertinent info from jvm.options below (we’re using
> m2.2xlarge instances in AWS):
>
>
>
>
>
> #################
>
> # HEAP SETTINGS #
>
> #################
>
>
>
> # Heap size is automatically calculated by cassandra-env based on this
>
> # formula: max(min(1/2 ram, 1024MB), min(1/4 ram, 8GB))
>
> # That is:
>
> # - calculate 1/2 ram and cap to 1024MB
>
> # - calculate 1/4 ram and cap to 8192MB
>
> # - pick the max
>
> #
>
> # For production use you may wish to adjust this for your environment.
>
> # If that's the case, uncomment the -Xmx and Xms options below to override the
>
> # automatic calculation of JVM heap memory.
>
> #
>
> # It is recommended to set min (-Xms) and max (-Xmx) heap sizes to
>
> # the same value to avoid stop-the-world GC pauses during resize, and
>
> # so that we can lock the heap in memory on startup to prevent any
>
> # of it from being swapped out.
>
> -Xms16G
>
> -Xmx16G
>
>
>
> # Young generation size is automatically calculated by cassandra-env
>
> # based on this formula: min(100 * num_cores, 1/4 * heap size)
>
> #
>
> # The main trade-off for the young generation is that the larger it
>
> # is, the longer GC pause times will be. The shorter it is, the more
>
> # expensive GC will be (usually).
>
> #
>
> # It is not recommended to set the young generation size if using the
>
> # G1 GC, since that will override the target pause-time goal.
>
> # More info: http://www.oracle.com/technetwork/articles/java/g1gc-1984535.html
>
> #
>
> # The example below assumes a modern 8-core+ machine for decent
>
> # times. If in doubt, and if you do not particularly want to tweak, go
>
> # 100 MB per physical CPU core.
>
> #-Xmn800M
>
>
>
> #################
>
> #  GC SETTINGS  #
>
> #################
>
>
>
> ### CMS Settings
>
>
>
> #-XX:+UseParNewGC
>
> #-XX:+UseConcMarkSweepGC
>
> #-XX:+CMSParallelRemarkEnabled
>
> #-XX:SurvivorRatio=8
>
> #-XX:MaxTenuringThreshold=1
>
> #-XX:CMSInitiatingOccupancyFraction=75
>
> #-XX:+UseCMSInitiatingOccupancyOnly
>
> #-XX:CMSWaitDuration=10000
>
> #-XX:+CMSParallelInitialMarkEnabled
>
> #-XX:+CMSEdenChunksRecordAlways
>
> # some JVMs will fill up their heap when accessed via JMX, see CASSANDRA-6541
>
> #-XX:+CMSClassUnloadingEnabled
>
>
>
> ### G1 Settings (experimental, comment previous section and uncomment section below to enable)
>
>
>
> ## Use the Hotspot garbage-first collector.
>
> -XX:+UseG1GC
>
> #
>
> ## Have the JVM do less remembered set work during STW, instead
>
> ## preferring concurrent GC. Reduces p99.9 latency.
>
> -XX:G1RSetUpdatingPauseTimePercent=5
>
> #
>
> ## Main G1GC tunable: lowering the pause target will lower throughput and vice versa.
>
> ## 200ms is the JVM default and lowest viable setting
>
> ## 1000ms increases throughput. Keep it smaller than the timeouts in cassandra.yaml.
>
> -XX:MaxGCPauseMillis=500
>
>
>
> ## Optional G1 Settings
>
>
>
> # Save CPU time on large (>= 16GB) heaps by delaying region scanning
>
> # until the heap is 70% full. The default in Hotspot 8u40 is 40%.
>
> -XX:InitiatingHeapOccupancyPercent=70
>
>
>
> # For systems with > 8 cores, the default ParallelGCThreads is 5/8 the number of logical cores.
>
> # Otherwise equal to the number of cores when 8 or less.
>
> # Machines with > 10 cores should try setting these to <= full cores.
>
> #-XX:ParallelGCThreads=16
>
> # By default, ConcGCThreads is 1/4 of ParallelGCThreads.
>
> # Setting both to the same value can reduce STW durations.
>
> #-XX:ConcGCThreads=16
>
>
>
> ### GC logging options -- uncomment to enable
>
>
>
> #-XX:+PrintGCDetails
>
> #-XX:+PrintGCDateStamps
>
> #-XX:+PrintHeapAtGC
>
> #-XX:+PrintTenuringDistribution
>
> #-XX:+PrintGCApplicationStoppedTime
>
> #-XX:+PrintPromotionFailure
>
> #-XX:PrintFLSStatistics=1
>
> #-Xloggc:/var/log/cassandra/gc.log
>
> #-XX:+UseGCLogFileRotation
>
> #-XX:NumberOfGCLogFiles=10
>
> #-XX:GCLogFileSize=10M
>
>
>
>
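
The automatic sizing rules described in the jvm.options comments quoted above can be worked through with a small sketch. The function names are illustrative, and the 32 GB, 8-core host is an assumed example rather than a measurement from this cluster.

```python
def default_heap_mb(ram_mb: int) -> int:
    """max(min(1/2 ram, 1024MB), min(1/4 ram, 8GB)), as stated in the comments."""
    half_capped = min(ram_mb // 2, 1024)      # 1/2 of RAM, capped at 1024 MB
    quarter_capped = min(ram_mb // 4, 8192)   # 1/4 of RAM, capped at 8192 MB
    return max(half_capped, quarter_capped)   # pick the max


def default_young_gen_mb(num_cores: int, heap_mb: int) -> int:
    """min(100 MB * num_cores, 1/4 * heap size), as stated in the comments."""
    return min(100 * num_cores, heap_mb // 4)


if __name__ == "__main__":
    ram_mb, cores = 32 * 1024, 8  # assumed host: 32 GB RAM, 8 cores
    heap = default_heap_mb(ram_mb)
    young = default_young_gen_mb(cores, heap)
    print(f"-Xms{heap}M -Xmx{heap}M -Xmn{young}M")
    # prints: -Xms8192M -Xmx8192M -Xmn800M
```

Under those assumptions the automatic heap would be 8 GB, so the -Xms16G / -Xmx16G lines in the file are a manual override of the default, and the 800 MB young generation matches the commented-out -Xmn800M example.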
>
> *From: *Alexander Dejanovski <alex@thelastpickle.com>
> *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Date: *Monday, April 3, 2017 at 8:00 AM
> *To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Subject: *Re: cassandra OOM
>
>
>
> Hi,
>
>
>
> Could you share your GC settings? G1 or CMS? Heap size, etc.
>
>
>
> Thanks,
>
>
>
> On Sun, Apr 2, 2017 at 10:30 PM Gopal, Dhruva <Dhruva.Gopal@aspect.com>
> wrote:
>
> Hi –
>
>   We’ve had what looks like an OOM situation with Cassandra (we have a
> dump file that got generated) in our staging (performance/load testing
> environment) and I wanted to reach out to this user group to see if you had
> any recommendations on how we should approach our investigation as to the
> cause of this issue. The logs don’t seem to point to any obvious issues,
> and we’re no experts in analyzing this by any means, so was looking for
> guidance on how to proceed. Should we enter a Jira as well? We’re on
> Cassandra 3.9, and are running a six-node cluster. This happened in a
> controlled load testing environment. Feedback will be much appreciated!
>
>
>
>
>
> Regards,
>
> Dhruva
>
>
>
> This email (including any attachments) is proprietary to Aspect Software,
> Inc. and may contain information that is confidential. If you have received
> this message in error, please do not read, copy or forward this message.
> Please notify the sender immediately, delete it from your system and
> destroy any copies. You may not further disclose or distribute this email
> or its attachments.
>
> --
>
> -----------------
>
> Alexander Dejanovski
>
> France
>
> @alexanderdeja
>
>
>
> Consultant
>
> Apache Cassandra Consulting
>
> http://www.thelastpickle.com
>
>
> ------------------------------
>
> The information in this Internet Email is confidential and may be legally
> privileged. It is intended solely for the addressee. Access to this Email
> by anyone else is unauthorized. If you are not the intended recipient, any
> disclosure, copying, distribution or any action taken or omitted to be
> taken in reliance on it, is prohibited and may be unlawful. When addressed
> to our clients any opinions or advice contained in this Email are subject
> to the terms and conditions expressed in any applicable governing The Home
> Depot terms of business or client engagement letter. The Home Depot
> disclaims all responsibility and liability for the accuracy and content of
> this attachment and for any damages or losses arising from any
> inaccuracies, errors, viruses, e.g., worms, trojan horses, etc., or other
> items of a destructive nature, which may be contained in this attachment
> and shall not be liable for direct, indirect, consequential or special
> damages in connection with this e-mail message or its attachment.
>
