lucene-solr-user mailing list archives

From Rick Dig <teram...@gmail.com>
Subject Re: SolrClould 6.6 stability challenges
Date Sat, 04 Nov 2017 13:25:55 GMT
we are not committing after each batch; we made sure that is turned off.
autoCommit maxTime is set to 300000 ms (300 seconds), and openSearcher is set to true.
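
for reference, a minimal sketch of how the autocommit settings described above are usually expressed in solrconfig.xml. only the two values (300000 and true) come from this message; the enclosing updateHandler element is assumed from the stock Solr config, and everything else in the file is omitted:

```xml
<!-- sketch only: values taken from this message, surrounding layout assumed -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <!-- hard commit at most every 300000 ms (300 seconds) -->
    <maxTime>300000</maxTime>
    <!-- each hard commit also opens a new searcher, making new documents visible -->
    <openSearcher>true</openSearcher>
  </autoCommit>
</updateHandler>
```

with openSearcher set to true and no explicit commits from the indexing clients, newly indexed documents become searchable only when this hard commit fires.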


On Sat, Nov 4, 2017 at 6:50 PM, Amrit Sarkar <sarkaramrit2@gmail.com> wrote:

> Pretty much what Emir has stated. I want to know more about this point:
>
> > all of this runs perfectly ok when indexing isn't happening. as soon as
> > we start "nrt" indexing one of the follower nodes goes down within 10 to
> > 20 minutes.
>
>
> When you say "NRT" indexing, what is the commit strategy during indexing? With
> the auto-commit interval set so high, are you committing after each batch? If
> yes, what's the number?
>
> Amrit Sarkar
> Search Engineer
> Lucidworks, Inc.
> 415-589-9269
> www.lucidworks.com
> Twitter http://twitter.com/lucidworks
> LinkedIn: https://www.linkedin.com/in/sarkaramrit2
>
> On Sat, Nov 4, 2017 at 2:47 PM, Emir Arnautović <
> emir.arnautovic@sematext.com> wrote:
>
> > Hi Rick,
> > Do you see any errors in the logs? Do you have any monitoring tool? Maybe you
> > can check heap and GC metrics around the time the incident happened. It is not
> > a large heap, but a major GC could cause a pause long enough to trigger a
> > snowball effect and end up with the node in recovery state.
> > What indexing rate do you observe? Why do you have max warming searchers set
> > to 5 (is this what you meant by autowarmingsearchers?) when you commit every
> > 5 min? Why did you increase it: did you see errors with the default of 2?
> > Maybe you commit after every bulk?
> > Do you see similar behaviour when you just do indexing without queries?
> >
> > Thanks,
> > Emir
> > --
> > Monitoring - Log Management - Alerting - Anomaly Detection
> > Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >
> >
> >
> > > On 4 Nov 2017, at 05:15, Rick Dig <teramera@gmail.com> wrote:
> > >
> > > hello all,
> > > we are trying to run solrcloud 6.6 in a production setting.
> > > here's our config and issue
> > > 1) 3 nodes, 1 shard, replication factor 3
> > > 2) all nodes are 16GB RAM, 4 core
> > > 3) Our production load is about 2000 requests per minute
> > > 4) index is fairly small, index size is around 400 MB with 300k documents
> > > 5) autocommit is currently set to 5 minutes (even though ideally we would
> > > like a smaller interval).
> > > 6) the jvm runs with 8 gb Xms and Xmx with CMS gc.
> > > 7) all of this runs perfectly ok when indexing isn't happening. as soon as
> > > we start "nrt" indexing one of the follower nodes goes down within 10 to 20
> > > minutes. from this point on the nodes never recover unless we stop
> > > indexing.  the master usually is the last one to fall.
> > > 8) there are maybe 5 to 7 processes indexing at the same time with document
> > > batch sizes of 500.
> > > 9) maxRambuffersizeMB is 100, autowarmingsearchers is 5,
> > > 10) no cpu and / or oom issues that we can see.
> > > 11) cpu load does go fairly high 15 to 20 at times.
> > > any help or pointers appreciated
> > >
> > > thanks
> > > rick
> >
> >
>
