lucene-solr-user mailing list archives

From Mads Tomasgård Bjørgan <...@dips.no>
Subject RE: Memory issues when indexing
Date Tue, 05 Jul 2016 13:23:06 GMT
Another update:

After creating a new certificate, properly specified for the context in which it is used, we still
end up in the described situation. Thus, it seems SSL itself is the underlying reason for the
leak.

-----Original Message-----
From: Mads Tomasgård Bjørgan [mailto:mtb@dips.no] 
Sent: Tuesday, 5 July 2016 10:36
To: solr-user@lucene.apache.org
Subject: RE: Memory issues when indexing

Hi again,
We turned off SSL - and now everything works as normal.

The certificate was not originally intended for use on the current servers - but we would
like to keep it, as it has already been deployed and is used by our customers. Thus
we need to launch the cloud with "-Dsolr.ssl.checkPeerName=false" - but it seems quite obvious
that the nodes still can't communicate properly.
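
To illustrate what disabling peer-name checking means in general, here is a minimal Python sketch using the standard `ssl` module. It is an analogy only, not Solr's implementation: it assumes that "-Dsolr.ssl.checkPeerName=false" behaves like switching off host name verification while leaving certificate-chain validation on. The function name `context_without_peer_name_check` is hypothetical.

```python
import ssl


def context_without_peer_name_check():
    """Build a client TLS context that still validates the certificate chain
    but does not compare the certificate's names with the host name."""
    # The default context enables both chain validation and host name checks.
    ctx = ssl.create_default_context()
    # Skip only the host name comparison; verify_mode stays CERT_REQUIRED.
    ctx.check_hostname = False
    return ctx


ctx = context_without_peer_name_check()
print(ctx.check_hostname, ctx.verify_mode == ssl.CERT_REQUIRED)
```

With a context like this, a certificate issued for a different host name is accepted as long as it chains to a trusted CA, while a certificate that does not chain to a trusted CA is still rejected.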

Our last resort is to replace the certificate - so the question now is whether it is possible
to tweak the configuration so that we can deploy a SolrCloud with the same certificate.

Thanks,
Mads

From: Mads Tomasgård Bjørgan [mailto:mtb@dips.no]
Sent: Tuesday, 5 July 2016 09:46
To: solr-user@lucene.apache.org
Subject: Memory issues when indexing

Hello,
We're struggling with memory issues when posting documents to Solr - and are unsure why
the problem occurs.

The documents are indexed in a SolrCloud running Solr 6.1.0 on top of Zookeeper 3.4.8, utilizing
three VMs running CentOS 7 and JRE 1.8.0.

After various attempts with different configurations, the heap always filled up on one, and
only one, of the machines (let's call this machine 1) - eventually yielding the following
exception:

(....) o.a.s.s.HttpSolrCall null:org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException:
Async exception during distributed update: Cannot assign requested address

The remaining two machines always have a lot of free memory compared with machine 1.

Thus, we decided to index only a small fraction of the documents to see whether the exception
was due to memory limitations or not. We stopped the indexing when the memory of machine
1 reached 2.5 GB of a total of 4 GB. As seen in the JConsole picture, machine 2 was only
using 1.4 GB of the available memory at the same time (the same goes for machine 3). The indexing
stopped - and both machine 2 and 3 had most of their memory freed when performing a garbage
collection. However - machine 1 was unaffected, and very little memory was freed, which means
Solr still used around 2.5 GB of the memory. I would assume the memory of machine 1 would be
freed in a similar manner as on machines 2 and 3, since the indexing was stopped. Most of
the memory belonged to the memory pool "CMS Old Gen" (well above 2 GB).

Indexing until the memory is full on machine 1 gives a "File Descriptor Count" of 50 000 -
while the number of files in the index folder is around 150 for each node. I was
told that the number of files in the index folder and the file descriptor count should
match? Machine 1 has an enormous number of TCP connections stalling in CLOSE_WAIT - while
machines 2 and 3 don't have the corresponding FIN_WAITs, even though machine 1 has almost all
of its TCP connections pointing at those machines.
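
The observation above can be quantified with a small script. The following is a minimal Python sketch, assuming the connection list is available as text in the column layout printed by `netstat -tan` on Linux (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State). The function name `count_tcp_states` and the addresses in `SAMPLE` are made up for illustration and are not taken from the machines described here. On Linux every open socket also holds a file descriptor, so sockets stuck in CLOSE_WAIT count towards the file descriptor total even though they are not index files.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP connections per (state, remote host) in netstat-style text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a TCP row with a state column.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        # Foreign address is "host:port"; keep only the host part.
        remote_host = fields[4].rsplit(":", 1)[0]
        state = fields[5]
        counts[(state, remote_host)] += 1
    return counts


# Hypothetical sample input, not real output from the cluster.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.0.0.1:8983           0.0.0.0:*               LISTEN
tcp        1      0 10.0.0.1:45012          10.0.0.2:8983           CLOSE_WAIT
tcp        1      0 10.0.0.1:45013          10.0.0.2:8983           CLOSE_WAIT
tcp        1      0 10.0.0.1:45014          10.0.0.3:8983           CLOSE_WAIT
tcp        0      0 10.0.0.1:45020          10.0.0.3:8983           ESTABLISHED
"""

for (state, host), n in sorted(count_tcp_states(SAMPLE).items()):
    print(state, host, n)
```

Running this periodically on machine 1 while indexing would show whether the CLOSE_WAIT count per remote node grows in step with the file descriptor count.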

[Attached images: image001.png, image002.png]
JConsole pictures for machine 1 and 2, respectively. At 08:45 we resumed indexing - the
same exception as shown above was thrown around 08:52. Machine 2 frees most of its memory
at GC - in contrast to machine 1.


We have no idea whether this is a bug or a fault in the configuration - and were hoping someone
could help us with our problem.

Greetings,
Mads
