lucene-solr-user mailing list archives

From Markus Jelsma <markus.jel...@openindex.io>
Subject RE: 6.6 cloud starting to eat CPU after 8+ hours
Date Thu, 20 Jul 2017 12:51:29 GMT
Cc mailinglist

Hello,

I thought that would come to mind, but do not worry: the heap averages 55 % all day
long and there is very little garbage collection going on; when it does happen, it is
the eden space that gets collected. If you really want, I can send such a file when
the problem occurs again, but even at those moments GC is minimal and the heap stays
at about 55 - 60 %, only peaking every 15 minutes when documents are indexed.
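
In case it helps, here is a minimal sketch of the Java 8 flags that produce such a
gc.log (the path is a placeholder; with the stock Solr scripts these would usually
go into GC_LOG_OPTS in solr.in.sh):

    -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
    -XX:+PrintGCTimeStamps -Xloggc:/var/solr/logs/solr_gc.log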

Thanks,
Markus
 
-----Original message-----
> From: Shawn Heisey <apache@elyograg.org>
> Sent: Wednesday 19th July 2017 16:08
> To: Markus Jelsma <markus.jelsma@openindex.io>
> Subject: Re: 6.6 cloud starting to eat CPU after 8+ hours
> 
> On 7/19/2017 3:35 AM, Markus Jelsma wrote:
> > Another peculiarity here: our six-node (2 shards / 3 replicas) cluster goes
> > crazy after a good part of the day has passed. It starts eating CPU for no good
> > reason and its latency goes up. Grafana graphs show the problem really well.
> >
> > After restarting 2 of the 6 nodes, there is also quite a distinction in the
> > VisualVM monitor views and in the VisualVM CPU sampler reports (sorted by self
> > time (CPU)). The busy nodes are deeply red in
> > o.a.h.impl.io.AbstractSessionInputBuffer.fillBuffer (as usual); the restarted
> > nodes are not.
> >
> > The real distinction between busy and calm nodes is that the busy nodes all
> > have o.a.l.codecs.perfield.PerFieldPostingsFormat$FieldsReader.terms() second
> > only to fillBuffer(). What are they doing, and why? The calm nodes don't show
> > this at all. Busy nodes all have o.a.l.codecs entries at the top; restarted
> > nodes don't.
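> >
> > For what it's worth, a minimal way to confirm what those threads are doing is
> > to take a couple of thread dumps on a busy node and compare them (just a
> > sketch, assuming a standard JDK; the pid is a placeholder):
> >
> >     jstack <solr-pid> > /tmp/dump-1.txt
> >     sleep 10
> >     jstack <solr-pid> > /tmp/dump-2.txt
> >
> > Threads sitting in PerFieldPostingsFormat$FieldsReader.terms() in both dumps
> > would show where the CPU is going.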
> >
> > So, actually, I don't have a clue! Any ideas?
> >
> > Thanks,
> > Markus
> >
> > Each replica is underpowered but performs really well after a restart (and JVM
> > warm-up): 4 CPUs, 900 MB heap, 8 GB RAM, maxDoc 2.8 million, index size 18 GB.
> 
> A 900MB heap seems very small for an 18GB index with millions of
> documents.  The first thing I would suspect is that the heap is running
> very near the maximum and that the JVM is spending a lot of time doing
> garbage collection.  Can you share the gc.log file from an instance that
> is showing the high CPU so this can be checked?  I'd also be interested
> in seeing solrconfig.xml.
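> 
> If the log does show GC pressure, raising the heap is a one-line change.
> A minimal sketch, assuming the stock solr.in.sh that ships with Solr 6.x
> (the value is only an example, not a sizing recommendation):
> 
>     SOLR_HEAP="2g"
> 
> Restart the node afterwards and watch whether the CPU pattern returns.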
> 
> Thanks,
> Shawn
> 
> 
