lucene-solr-user mailing list archives

From Walter Underwood <wun...@wunderwood.org>
Subject Re: Solr OOM Crashes / JVM tuning advice
Date Wed, 11 Apr 2018 14:49:21 GMT
For readability, I’d use -Xmx12G instead of -XX:MaxHeapSize=12884901888. Also, I always use
a start size the same as the max size, since servers will eventually grow to the max size.
So:

-Xmx12G -Xms12G
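
To confirm that the two spellings above ask for the same heap, here is a minimal Java sketch that converts a JVM-style size string to bytes. The class name `HeapSizes` and the method `parseSize` are hypothetical helpers written for this illustration, not part of the JVM or Solr; the sketch assumes the K/M/G suffixes are powers of 1024, which is how the `java` launcher treats them.

```java
// Sketch only: HeapSizes and parseSize are hypothetical names for illustration.
final class HeapSizes {
    /** Converts a size such as "12G", "7000m" or "12884901888" to bytes (K/M/G = powers of 1024). */
    static long parseSize(String text) {
        String s = text.trim().toLowerCase();
        if (s.isEmpty()) {
            throw new IllegalArgumentException("empty size");
        }
        long multiplier;
        switch (s.charAt(s.length() - 1)) {
            case 'k': multiplier = 1024L; break;
            case 'm': multiplier = 1024L * 1024; break;
            case 'g': multiplier = 1024L * 1024 * 1024; break;
            default:  multiplier = 1L; // no suffix: the value is already in bytes
        }
        String digits = (multiplier == 1L) ? s : s.substring(0, s.length() - 1);
        // multiplyExact throws on overflow instead of silently wrapping around
        return Math.multiplyExact(Long.parseLong(digits), multiplier);
    }

    public static void main(String[] args) {
        // -Xmx12G and -XX:MaxHeapSize=12884901888 describe the same size
        System.out.println(parseSize("12G") == parseSize("12884901888")); // prints "true"
        // What the running JVM reports as its own maximum heap, in bytes
        System.out.println(Runtime.getRuntime().maxMemory());
    }
}
```

The value reported by `Runtime.getRuntime().maxMemory()` can be slightly lower than the configured `-Xmx`, depending on the collector, so treat it as a sanity check rather than an exact echo of the flag.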

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Apr 11, 2018, at 6:29 AM, Sujay Bawaskar <sujaybawaskar@gmail.com> wrote:
> 
> Which directory factory is defined in solrconfig.xml? Your JVM heap should
> be tuned with respect to that.
> How is Solr being used: more updates and fewer queries, or fewer updates
> and more queries?
> What is the OOM error? Is it frequent GC or Error 12?
> 
> On Wed, Apr 11, 2018 at 6:05 PM, Adam Harrison-Fuller <aharrison-fuller@mintel.com> wrote:
> 
>> Hey Jesus,
>> 
>> Thanks for the suggestions.  The Solr nodes have 4 CPUs assigned to them.
>> 
>> Cheers!
>> Adam
>> 
>> On 11 April 2018 at 11:22, Jesus Olivan <jesus.olivan@letgo.com> wrote:
>> 
>>> Hi Adam,
>>> 
>>> IMHO you could try increasing the heap to 20 GB (with 46 GB of physical
>>> RAM, your JVM can afford more heap without penalties caused by a lack of
>>> RAM outside the heap).
>>> 
>>> Another good one would be to increase -XX:CMSInitiatingOccupancyFraction
>>> from 50 to 75. I think the CMS collector works better when the old
>>> generation space is more populated.
>>> 
>>> I usually set the survivor spaces to a smaller size. If you try setting
>>> SurvivorRatio to 6, I think performance would improve.
>>> 
>>> Another good practice, in my experience, is to set a static NewSize
>>> instead of -XX:NewRatio=3.
>>> You could try setting -XX:NewSize=7000m and -XX:MaxNewSize=7000m (one
>>> third of total heap space is recommended).
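
As a rough illustration of what these ratios mean in bytes, the following Java sketch applies the usual HotSpot definitions: `NewRatio=N` sets old:young to N:1, and `SurvivorRatio=N` sets eden:one-survivor to N:1, with two survivor spaces in the young generation. The class and method names are hypothetical, and the real JVM rounds and aligns these sizes, so the results are approximations only.

```java
// Sketch only: YoungGenSizing is a hypothetical helper, and real JVM sizes are aligned/rounded.
final class YoungGenSizing {
    /** -XX:NewRatio=N means old:young = N:1, so young is heap / (N + 1). */
    static long youngBytes(long heapBytes, int newRatio) {
        return heapBytes / (newRatio + 1);
    }

    /** -XX:SurvivorRatio=N means eden:survivor = N:1; young = eden + 2 survivors. */
    static long survivorBytes(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    public static void main(String[] args) {
        final long mib = 1024L * 1024;
        long heap = 12884901888L;            // current heap from the thread (12 GiB)
        long young = youngBytes(heap, 3);    // current -XX:NewRatio=3

        System.out.println(young / mib);                    // 3072 (MiB of young generation)
        System.out.println(survivorBytes(young, 4) / mib);  // 512  (current SurvivorRatio=4)
        System.out.println(survivorBytes(young, 6) / mib);  // 384  (proposed SurvivorRatio=6)

        // Proposed static young generation of 7000m with SurvivorRatio=6
        System.out.println(survivorBytes(7000 * mib, 6) / mib); // 875
    }
}
```

Under these definitions the current settings give a young generation of about a quarter of the heap, whereas a fixed `-XX:NewSize=7000m` on a 20 GB heap is close to the one third suggested above.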
>>> 
>>> Finally, my best results after deep JVM R&D related to Solr came from
>>> removing the CMSScavengeBeforeRemark flag and adding this new one:
>>> ParGCCardsPerStrideChunk.
>>> 
>>> However, it would also be good to set ParallelGCThreads and
>>> ConcGCThreads to their optimal values, and we need your system's CPU
>>> count to determine them. Can you provide this, please?
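
For orientation on how the CPU count feeds into these two flags, here is a small Java sketch of the defaults HotSpot is commonly described as using on JDK 8 with CMS when neither flag is set. Both formulas are stated here as an assumption rather than taken from this thread, and the class and method names are hypothetical; running `java -XX:+PrintFlagsFinal -version` on the target machine shows the values the JVM actually picks.

```java
// Sketch only: formulas are an assumption about JDK 8 HotSpot defaults, not taken from this thread.
final class GcThreadDefaults {
    /** Assumed default for -XX:ParallelGCThreads: all CPUs up to 8, then 5/8 of each extra CPU. */
    static int parallelGcThreads(int cpus) {
        return cpus <= 8 ? cpus : 8 + (cpus - 8) * 5 / 8;
    }

    /** Assumed default for -XX:ConcGCThreads under CMS: about a quarter of the parallel threads. */
    static int cmsConcGcThreads(int parallelThreads) {
        return (parallelThreads + 3) / 4;
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        int parallel = parallelGcThreads(cpus);
        System.out.println("CPUs=" + cpus
                + " ParallelGCThreads=" + parallel
                + " ConcGCThreads=" + cmsConcGcThreads(parallel));
    }
}
```

For the 4-CPU nodes mentioned in this thread, these formulas would give 4 parallel threads and 1 concurrent thread, compared with the configured `-XX:ConcGCThreads=4`.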
>>> 
>>> Regards
>>> 
>>> 
>>> 2018-04-11 12:01 GMT+02:00 Adam Harrison-Fuller <aharrison-fuller@mintel.com>:
>>> 
>>>> Hey all,
>>>> 
>>>> I was wondering if I could get some JVM/GC tuning advice to resolve an
>>>> issue that we are experiencing.
>>>> 
>>>> Full disclaimer, I am in no way a JVM/Solr expert so any advice you can
>>>> render would be greatly appreciated.
>>>> 
>>>> Our SolrCloud nodes are throwing OOM exceptions under load.
>>>> This issue has only started manifesting itself over the last few months,
>>>> during which time the only change I can discern is an increase in index
>>>> size.  They are running Solr 5.5.2 on OpenJDK version "1.8.0_101".  The
>>>> index is currently 58G and the server has 46G of physical RAM and runs
>>>> nothing other than the Solr node.
>>>> 
>>>> The JVM is invoked with the following JVM options:
>>>> -XX:CMSInitiatingOccupancyFraction=50 -XX:CMSMaxAbortablePrecleanTime=6000
>>>> -XX:+CMSParallelRemarkEnabled -XX:+CMSScavengeBeforeRemark
>>>> -XX:ConcGCThreads=4 -XX:InitialHeapSize=12884901888 -XX:+ManagementServer
>>>> -XX:MaxHeapSize=12884901888 -XX:MaxTenuringThreshold=8
>>>> -XX:NewRatio=3 -XX:OldPLABSize=16
>>>> -XX:OnOutOfMemoryError=/opt/solr/bin/oom_solr.sh 30000 /data/gnpd/solr/logs
>>>> -XX:ParallelGCThreads=4
>>>> -XX:+ParallelRefProcEnabled -XX:PretenureSizeThreshold=67108864
>>>> -XX:+PrintGC -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDateStamps
>>>> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC
>>>> -XX:+PrintTenuringDistribution -XX:SurvivorRatio=4
>>>> -XX:TargetSurvivorRatio=90
>>>> -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCompressedClassPointers
>>>> -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
>>>> 
>>>> These values were decided upon several years ago by a colleague, based
>>>> on suggestions from this mailing list, when the index size was ~25G.
>>>> 
>>>> I have imported the GC logs into GCViewer and attached a link to a
>>>> screenshot showing the lead-up to an OOM crash.  Interestingly, the young
>>>> generation space is almost empty before the repeated GCs and subsequent
>>>> crash.
>>>> https://imgur.com/a/Wtlez
>>>> 
>>>> I was considering slowly increasing the amount of heap available to the
>>>> JVM until the crashes stop. Any other suggestions?  I'm trying to get
>>>> the nodes stable without the GC taking forever to run.
>>>> 
>>>> Additional information can be provided on request.
>>>> 
>>>> Cheers!
>>>> Adam
>>>> 
>>>> --
>>>> 
>>>> Mintel Group Ltd | 11 Pilgrim Street | London | EC4V 6RN
>>>> Registered in
>>>> England: Number 1475918. | VAT Number: GB 232 9342 72
>>>> 
>>>> Contact details for
>>>> our other offices can be found at http://www.mintel.com/office-locations
>>>> <http://www.mintel.com/office-locations>.
>>>> 
>>>> This email and any attachments
>>>> may include content that is confidential, privileged
>>>> or otherwise
>>>> protected under applicable law. Unauthorised disclosure, copying,
>>>> distribution
>>>> or use of the contents is prohibited and may be unlawful. If
>>>> you have received this email in error,
>>>> including without appropriate
>>>> authorisation, then please reply to the sender about the error
>>>> and delete
>>>> this email and any attachments.
>>>> 
>>>> 
>>> 
>> 
> 
> 
> -- 
> Thanks,
> Sujay P Bawaskar
> M:+91-77091 53669

