lucene-solr-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Joe Obernberger <joseph.obernber...@gmail.com>
Subject Solr 6.6.0 - High CPU during indexing
Date Fri, 18 Aug 2017 16:37:26 GMT
Indexing about 15 million documents per day across 100 shards on 45 
servers.  Up until about 350 million documents, each of the solr 
instances was taking up about 1 core (100% CPU).  Recently, they all 
jumped to 700%.  Is this normal?  Anything that I can check for?

I don't see anything unusual in the solr logs.  Sample from the GC logs:

---------------

2017-08-18 11:53:15 GC log file created /opt/solr6/server/logs/solr_gc.log.2
OpenJDK 64-Bit Server VM (25.141-b16) for linux-amd64 JRE 
(1.8.0_141-b16), built on Jul 20 2017 11:14:57 by "mockbuild" with gcc 
4.4.7 20120313 (Red Hat 4.4.7-18)
Memory: 4k page, physical 99016188k(796940k free), swap 
33554428k(32614048k free)
CommandLine flags: -XX:+AggressiveOpts -XX:CICompilerCount=12 
-XX:ConcGCThreads=4 -XX:G1HeapRegionSize=16777216 
-XX:GCLogFileSize=20971520 -XX:InitialHeapSize=17179869184 
-XX:InitiatingHeapOccupancyPercent=75 -XX:MarkStackSize=4194304 
-XX:MaxDirectMemorySize=3221225472 -XX:MaxGCPauseMillis=300 
-XX:MaxHeapSize=30064771072 -XX:MaxNewSize=18035507200 
-XX:MinHeapDeltaBytes=16777216 -XX:NumberOfGCLogFiles=9 
-XX:OnOutOfMemoryError=/opt/solr6/bin/oom_solr.sh 9100 
/opt/solr6/server/logs -XX:ParallelGCThreads=16 
-XX:+ParallelRefProcEnabled -XX:+PerfDisableSharedMem -XX:+PrintGC 
-XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDateStamps 
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC 
-XX:+PrintTenuringDistribution -XX:-ResizePLAB -XX:ThreadStackSize=256 
-XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC 
-XX:+UseGCLogFileRotation -XX:+UseLargePages
{Heap before GC invocations=559440 (full 0):
  garbage-first heap   total 29360128K, used 24944705K 
[0x00000000c0000000, 0x00000000c1003800, 0x00000007c0000000)
   region size 16384K, 1075 young (17612800K), 13 survivors (212992K)
  Metaspace       used 95460K, capacity 97248K, committed 97744K, 
reserved 1134592K
   class space    used 11616K, capacity 12104K, committed 12240K, 
reserved 1048576K
2017-08-18T11:53:15.985-0400: 522594.835: [GC pause (G1 Evacuation 
Pause) (young)
Desired survivor size 1132462080 bytes, new threshold 15 (max 15)
- age   1:   23419920 bytes,   23419920 total
- age   2:    9355296 bytes,   32775216 total
- age   3:    2455384 bytes,   35230600 total
- age   4:   38246704 bytes,   73477304 total
- age   5:   47064408 bytes,  120541712 total
- age   6:   13228864 bytes,  133770576 total
- age   7:   23990800 bytes,  157761376 total
- age   8:    1031416 bytes,  158792792 total
- age   9:   17011128 bytes,  175803920 total
- age  10:    7371888 bytes,  183175808 total
- age  11:    6226576 bytes,  189402384 total
- age  12:     637184 bytes,  190039568 total
- age  13:   11577864 bytes,  201617432 total
- age  14:    9519224 bytes,  211136656 total
- age  15:     672304 bytes,  211808960 total
, 0.0391210 secs]
    [Parallel Time: 32.1 ms, GC Workers: 16]
       [GC Worker Start (ms): Min: 522594835.0, Avg: 522594835.1, Max: 
522594835.2, Diff: 0.2]
       [Ext Root Scanning (ms): Min: 0.5, Avg: 0.8, Max: 2.2, Diff: 1.7, 
Sum: 12.2]
       [Update RS (ms): Min: 0.9, Avg: 2.3, Max: 3.2, Diff: 2.2, Sum: 36.6]
          [Processed Buffers: Min: 3, Avg: 4.7, Max: 8, Diff: 5, Sum: 75]
       [Scan RS (ms): Min: 0.1, Avg: 0.2, Max: 0.4, Diff: 0.3, Sum: 3.0]
       [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 
0.0, Sum: 0.1]
       [Object Copy (ms): Min: 27.7, Avg: 28.3, Max: 28.6, Diff: 0.8, 
Sum: 453.5]
       [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
          [Termination Attempts: Min: 1, Avg: 1.3, Max: 2, Diff: 1, Sum: 21]
       [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.3, 
Sum: 2.4]
       [GC Worker Total (ms): Min: 31.6, Avg: 31.7, Max: 32.0, Diff: 
0.4, Sum: 507.9]
       [GC Worker End (ms): Min: 522594866.7, Avg: 522594866.8, Max: 
522594867.0, Diff: 0.2]
    [Code Root Fixup: 0.1 ms]
    [Code Root Purge: 0.0 ms]
    [Clear CT: 1.7 ms]
    [Other: 5.2 ms]
       [Choose CSet: 0.0 ms]
       [Ref Proc: 2.9 ms]
       [Ref Enq: 0.1 ms]
       [Redirty Cards: 0.2 ms]
       [Humongous Register: 0.1 ms]
       [Humongous Reclaim: 0.0 ms]
       [Free CSet: 1.6 ms]
    [Eden: 16.6G(16.6G)->0.0B(16.6G) Survivors: 208.0M->208.0M Heap: 
23.8G(28.0G)->7371.4M(28.0G)]
Heap after GC invocations=559441 (full 0):
  garbage-first heap   total 29360128K, used 7548353K 
[0x00000000c0000000, 0x00000000c1003800, 0x00000007c0000000)
   region size 16384K, 13 young (212992K), 13 survivors (212992K)
  Metaspace       used 95460K, capacity 97248K, committed 97744K, 
reserved 1134592K
   class space    used 11616K, capacity 12104K, committed 12240K, 
reserved 1048576K
}
  [Times: user=0.54 sys=0.00, real=0.04 secs]
2017-08-18T11:53:16.024-0400: 522594.874: Total time for which 
application threads were stopped: 0.0471187 seconds, Stopping threads 
took: 0.0001739 seconds
2017-08-18T11:53:16.811-0400: 522595.661: Total time for which 
application threads were stopped: 0.0019163 seconds, Stopping threads 
took: 0.0001631 seconds
2017-08-18T11:53:17.210-0400: 522596.060: Total time for which 
application threads were stopped: 0.0010934 seconds, Stopping threads 
took: 0.0001659 seconds

----------------

-Joe


Mime
View raw message