lucene-solr-user mailing list archives

From Shawn Heisey <apa...@elyograg.org>
Subject Re: solr reads whole index on startup
Date Mon, 10 Dec 2018 17:37:33 GMT
On 12/7/2018 8:54 AM, Erick Erickson wrote:
> Here's the trap: _Indexing_ doesn't take much memory. The memory
> is bounded by ramBufferSizeMB, which defaults to 100.

This statement is completely true.  But it hides one detail:  A large
indexing job fills and reallocates this buffer over and over.  So
although indexing doesn't occupy much memory at any given moment, the
total amount of memory allocated over the course of a large indexing run
will be enormous, which keeps the garbage collector busy.  This is
particularly true when segment merging happens.
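
To put rough numbers on that, here is a back-of-envelope sketch in Python.
It is only an illustration under simplifying assumptions:  the buffer is
treated as filling exactly once per ramBufferSizeMB of buffered data,
merges are ignored, and the 500GB input is a made-up figure, not
something taken from this thread.  The function names are hypothetical.

```python
import math


def buffer_fills(total_indexed_mb: float, ram_buffer_mb: float = 100.0) -> int:
    """How many times a buffer of ram_buffer_mb fills while
    total_indexed_mb of data passes through it."""
    if ram_buffer_mb <= 0:
        raise ValueError("ram_buffer_mb must be positive")
    return math.ceil(total_indexed_mb / ram_buffer_mb)


def cumulative_allocation_mb(total_indexed_mb: float, ram_buffer_mb: float = 100.0) -> float:
    """Total memory handed out for the buffer over the whole run.
    A lower bound: it ignores merging and every other allocation."""
    return buffer_fills(total_indexed_mb, ram_buffer_mb) * ram_buffer_mb


if __name__ == "__main__":
    total_mb = 500 * 1024  # hypothetical: 500GB of buffered index data
    print("peak buffer size (MB):", 100)
    print("buffer fills:", buffer_fills(total_mb))
    print("cumulative allocation (MB):", cumulative_allocation_mb(total_mb))
```

The peak stays at the buffer size while the cumulative figure grows with
the amount of data indexed, and the cumulative figure is what the garbage
collector has to clean up.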

Going over the whole thread:

Graceful shutdown on Solr 7.5 (for non-Windows operating systems) should 
allow up to three minutes for Solr to shut down normally before it 
hard-kills the instance.  On Windows it only waits 5 seconds, which is 
not enough.  What OS is it on?

The problems you've described do sound like your Solr instances are 
experiencing massive GC pauses.  This can make *ALL* Solr activity take 
a long time, including index recovery operations.  Increasing the heap 
size MIGHT alleviate these problems.

If every machine is handling 700GB of index data and 1.4 billion docs 
(assuming one third of the 2.1 billion docs per shard replica, two 
replicas per machine), you're going to need a lot of heap memory for 
Solr to run well.  With your indexes in HDFS, the HDFS software running 
inside Solr also needs heap memory to operate, and is probably going to 
set aside part of the heap for caching purposes.  I thought I saw 
something in the thread about a 6GB heap size.  This is probably way too
small.  For everything you've described, I have to agree with Erick ...
16GB total memory is VERY undersized.  It's likely unrealistic to have 
enough memory for the whole index ... but for this setup, I'd definitely 
want a LOT more than 16GB.
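
Spelling out the arithmetic behind those numbers, as a small Python
sketch.  The document count, index size, and 16GB total come from this
thread; the three-way split is my reading of "one third ... per shard
replica".  The helper names are hypothetical.

```python
TOTAL_DOCS = 2_100_000_000      # docs in the collection, per the thread
SHARDS = 3                      # assumption: each shard replica holds one third
REPLICAS_PER_MACHINE = 2        # per the thread
INDEX_GB_PER_MACHINE = 700      # per the thread
TOTAL_RAM_GB = 16               # per the thread


def docs_per_machine(total_docs: int, shards: int, replicas_per_machine: int) -> int:
    """Documents hosted on one machine, assuming evenly sized shards."""
    return total_docs // shards * replicas_per_machine


def memory_to_index_ratio(memory_gb: float, index_gb: float) -> float:
    """Fraction of the on-disk index that would fit in memory_gb."""
    return memory_gb / index_gb


if __name__ == "__main__":
    print("docs per machine:",
          docs_per_machine(TOTAL_DOCS, SHARDS, REPLICAS_PER_MACHINE))
    ratio = memory_to_index_ratio(TOTAL_RAM_GB, INDEX_GB_PER_MACHINE)
    print(f"total memory as a fraction of index size: {ratio:.1%}")
```

Even if every byte of the 16GB were available for caching index data,
which it is not because the heap and the OS need their share, it would
cover only a little over 2 percent of the index on each machine.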

As Solr runs, it writes a GC log.  Can you share all of the GC log files 
that Solr has created?  There should not be any proprietary information 
in those files.
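
If you want a quick look yourself before sending them, something like
the Python sketch below can pull the pause times out of a log.  It
assumes a Java 8 style GC log written with
-XX:+PrintGCApplicationStoppedTime, which produces lines containing
"Total time for which application threads were stopped".  I believe the
stock start script turns that option on, but check your GC logging
options to be sure.  The function names and the one-second threshold are
my own choices for illustration.

```python
import re
from typing import Iterable, List

# Matches the pause duration on lines written by
# -XX:+PrintGCApplicationStoppedTime. Some locales print a decimal comma.
_STOPPED = re.compile(
    r"Total time for which application threads were stopped: "
    r"([0-9]+[.,][0-9]+) seconds"
)


def pause_seconds(lines: Iterable[str]) -> List[float]:
    """Return every application-stopped duration found, in log order."""
    pauses = []
    for line in lines:
        match = _STOPPED.search(line)
        if match:
            pauses.append(float(match.group(1).replace(",", ".")))
    return pauses


def long_pauses(lines: Iterable[str], threshold_s: float = 1.0) -> List[float]:
    """Return only the pauses at or above threshold_s seconds."""
    return [p for p in pause_seconds(lines) if p >= threshold_s]
```

To use it, open one of the GC log files and pass the file object to
long_pauses().  Not every stopped-time line is caused by garbage
collection, so a long list of multi-second entries is a strong hint
rather than proof.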

Thanks,
Shawn

