lucene-solr-user mailing list archives

From vishal patel <vishalpatel200...@outlook.com>
Subject Re: Shard and replica went down in Solr 6.1.0
Date Sun, 14 Apr 2019 03:29:02 GMT

Thanks for your reply.

2> In production, a large number of documents arrive for indexing within a second. If I set the hard
commit interval to 60 seconds, hard commits will execute much more often. Is that OK for performance?
3> We cannot set the soft commit to 60 seconds as of now, because our product behaves like NRT:
changes must be visible immediately after indexing.

Actually, I have analysed my production issue logs further:
the replica becomes the leader, and some time later documents still arrive at this new leader as
forwarded (FROMLEADER) updates, as if it were still only a replica.
My production log :

Shard Log(10.102.119.85) :

2019-04-08 12:54:01.405 ERROR (updateExecutor-2-thread-36358-processing-http:////10.101.111.80:8983//solr//product
x:product r:core_node1 n:10.102.119.85:8983_solr s:shard1 c:product) [c:product s:shard1 r:core_node1
x:product] o.a.s.u.StreamingSolrClients error
org.apache.solr.common.SolrException: Service Unavailable
request: http://10.101.111.80:8983/solr/product/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.102.119.85%3A8983%2Fsolr%2Fproduct%2F&wt=javabin&version=2
at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:320)
at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:185)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$22(ExecutorUtil.java:229)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$3/30175207.run(Unknown
Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)


Replica Log(10.101.111.80) :

2019-04-08 12:52:19.717 INFO  (zkCallback-4-thread-207-processing-n:10.101.111.80:8983_solr)
[   ] o.a.s.c.c.ZkStateReader A live node change: [WatchedEvent state:SyncConnected type:NodeChildrenChanged
path:/live_nodes], has occurred - updating... (live nodes size: [4])
2019-04-08 12:52:19.725 INFO  (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr)
[c:product s:shard1 r:core_node3 x:product] o.a.s.c.ShardLeaderElectionContext Running the
leader process for shard=shard1 and weAreReplacement=true and leaderVoteWait=180000
2019-04-08 12:52:19.727 INFO  (zkCallback-4-thread-207-processing-n:10.101.111.80:8983_solr)
[   ] o.a.s.c.c.ZkStateReader Updated live nodes from ZooKeeper... (4) -> (3)
2019-04-08 12:52:19.764 INFO  (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr)
[c:product s:shard1 r:core_node3 x:product] o.a.s.c.ShardLeaderElectionContext Checking if
I should try and be the leader.
2019-04-08 12:52:19.765 INFO  (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr)
[c:product s:shard1 r:core_node3 x:product] o.a.s.c.ShardLeaderElectionContext My last published
State was Active, it's okay to be the leader.
2019-04-08 12:52:19.766 INFO  (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr)
[c:product s:shard1 r:core_node3 x:product] o.a.s.c.ShardLeaderElectionContext I may be the
new leader - try and sync
2019-04-08 12:52:19.767 WARN  (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr)
[c:product s:shard1 r:core_node3 x:product] o.a.s.c.RecoveryStrategy Stopping recovery for
core=[product] coreNodeName=[core_node3]
2019-04-08 12:52:20.291 INFO  (zkCallback-4-thread-207-processing-n:10.101.111.80:8983_solr)
[   ] o.a.s.c.c.ZkStateReader A cluster state change: [WatchedEvent state:SyncConnected type:NodeDataChanged
path:/clusterstate.json], has occurred - updating... (live nodes size: [3])
2019-04-08 12:52:20.532 INFO  (zkCallback-4-thread-207-processing-n:10.101.111.80:8983_solr)
[   ] o.a.s.c.c.ZkStateReader A cluster state change: [WatchedEvent state:SyncConnected type:NodeDataChanged
path:/clusterstate.json], has occurred - updating... (live nodes size: [3])
2019-04-08 12:52:22.274 INFO  (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr)
[c:product s:shard1 r:core_node3 x:product] o.a.s.c.SyncStrategy Sync replicas to http://10.101.111.80:8983/solr/product/
2019-04-08 12:52:22.274 INFO  (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr)
[c:product s:shard1 r:core_node3 x:product] o.a.s.c.SyncStrategy Sync Success - now sync replicas
to me
2019-04-08 12:52:22.274 INFO  (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr)
[c:product s:shard1 r:core_node3 x:product] o.a.s.c.SyncStrategy http://10.101.111.80:8983/solr/product/
has no replicas
2019-04-08 12:52:22.275 INFO  (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr)
[c:product s:shard1 r:core_node3 x:product] o.a.s.c.ShardLeaderElectionContextBase Creating
leader registration node /collections/product/leaders/shard1/leader after winning as /collections/product/leader_elect/shard1/election/245953614547976270-core_node3-n_0000000001
2019-04-08 12:52:22.384 INFO  (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr)
[c:product s:shard1 r:core_node3 x:product] o.a.s.c.ShardLeaderElectionContext I am the new
leader: http://10.101.111.80:8983/solr/product/ shard1

2019-04-08 12:54:01.294 ERROR (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product]
o.a.s.u.p.DistributedUpdateProcessor Request says it is coming from leader, but we are the
leader: update.distrib=FROMLEADER&distrib.from=http://10.102.119.85:8983/solr/product/&wt=javabin&version=2
2019-04-08 12:54:01.294 INFO  (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product]
o.a.s.u.p.LogUpdateProcessorFactory [product]  webapp=/solr path=/update params={update.distrib=FROMLEADER&distrib.from=http://10.102.119.85:8983/solr/product/&wt=javabin&version=2}{}
0 0
2019-04-08 12:54:01.295 ERROR (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product]
o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Request says it is coming
from leader, but we are the leader
at org.apache.solr.update.processor.DistributedUpdateProcessor.doDefensiveChecks(DistributedUpdateProcessor.java:621)
at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:392)
at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:320)
at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:679)
at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)


First, for what reason did the replica become the leader? And if the replica became the leader, why,
two minutes later, do documents still arrive at the new leader as if it were still a replica? We index
documents through ZooKeeper (a cluster-aware client) rather than directly against a shard.
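
Our indexing path is roughly like the minimal SolrJ sketch below (ZooKeeper hosts and field values
are placeholders, and the sketch itself issues no explicit commit):

    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    // Minimal sketch of indexing through ZooKeeper with a cluster-aware client.
    // ZK hosts and field values are placeholders; error handling is omitted.
    public class IndexViaZk {
        public static void main(String[] args) throws Exception {
            try (CloudSolrClient solr = new CloudSolrClient("zk1:2181,zk2:2181,zk3:2181")) {
                solr.setDefaultCollection("product");

                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", "example-1");

                // The client asks ZooKeeper for the current shard leader and sends the
                // document there; no explicit commit is issued from this code.
                solr.add(doc);
            }
        }
    }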

Sent from Outlook<http://aka.ms/weboutlook>
________________________________
From: Erick Erickson <erickerickson@gmail.com>
Sent: Friday, April 12, 2019 8:52 PM
To: vishal patel
Subject: Re: Shard and replica went down in Solr 6.1.0

What was unclear about the statement "These config options will _not_ open any new searchers.
Period."?

You _cannot_ be opening new searchers automatically, some client somewhere _must_ be issuing
a commit. And you shouldn’t do that. Until you find the external source that’s issuing
the commit, you’ll make no progress on this.

Is this an explicit enough recommendation?

1> find the source of the commits and stop doing that (the sketch after this list shows the kind of
client-side call to look for).
2> change your hard commit interval to 60 seconds. Keep openSearcher set to false.
3> change your soft commit to 60 seconds (this will imply that you can’t search docs
for a minute).
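
To illustrate point 1 (an illustrative SolrJ sketch, not code from your application): these are the
kinds of client-side calls that force commits, and therefore searcher opens, outside of
autoCommit/autoSoftCommit, and that you want to find and remove:

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.request.UpdateRequest;
    import org.apache.solr.common.SolrInputDocument;

    // Illustrative only -- host, collection and field are placeholders.
    public class ExplicitCommitAntiPattern {
        public static void main(String[] args) throws Exception {
            try (SolrClient solr = new HttpSolrClient("http://localhost:8983/solr/product")) {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", "example-1");

                solr.add(doc);
                solr.commit();              // explicit client-side commit: remove calls like this

                UpdateRequest req = new UpdateRequest();
                req.add(doc);
                req.setCommitWithin(1000);  // a very small commitWithin also triggers frequent
                req.process(solr);          // (by default soft) commits and searcher opens
            }
        }
    }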

This may or may not be the root of the “Request says it is coming from the leader” message.
But it’s certainly a contributing factor. We can identify this positively as a bad configuration
so let’s eliminate it and _then_ look for more issues.

Best,
Erick

> On Apr 11, 2019, at 10:36 PM, vishal patel <vishalpatel200928@outlook.com> wrote:
>
> Thanks for your reply.
>
> Actually, all caches are removed in my solrconfig.xml, so the autowarm count does not matter
> for us.
>
> And I read the link you gave about the hard commit and soft commit concepts.
> My production scenario is index-heavy and query-heavy [Near Real Time].
> So we set the hard commit max time to 10 minutes and the soft commit to -1 for each collection
> [27 collections in production].
> Is that OK for production? Please give some input.
>
> And how can we know from the logs why a shard is going down?
>
> And please tell me the meaning of the error below, because I cannot find this type of error
> on any web page.
>
> Shard-1-replica
> -----------------
> 2019-04-08 12:54:01.294 ERROR (qtp1239731077-1022464) [c:product s:shard1 r:core_node3
x:product] o.a.s.u.p.DistributedUpdateProcessor Request says it is coming from leader, but
we are the leader: update.distrib=FROMLEADER&distrib.from=http://10.102.119.85:8983/solr/product/&wt=javabin&version=2
> 2019-04-08 12:54:01.294 INFO  (qtp1239731077-1022464) [c:product s:shard1 r:core_node3
x:product] o.a.s.u.p.LogUpdateProcessorFactory [product]  webapp=/solr path=/update params={update.distrib=FROMLEADER&distrib.from=http://10.102.119.85:8983/solr/product/&wt=javabin&version=2}{}
0 0
> 2019-04-08 12:54:01.295 ERROR (qtp1239731077-1022464) [c:product s:shard1 r:core_node3
x:product] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Request says it
is coming from leader, but we are the leader
> at
>
> Shard-1
> ------------------
> 2019-04-08 12:54:01.534 ERROR (updateExecutor-2-thread-36358-processing-http:////10.101.111.80:8983//solr//product3
x:product3 r:core_node1 n:10.102.119.85:8983_solr s:shard1 c:product3) [c:product3 s:shard1
r:core_node1 x:product3] o.a.s.u.StreamingSolrClients error
> org.apache.solr.common.SolrException: Service Unavailable
>
>
> We got the error first on the replica, then on the shard.
>
> Sent from Outlook
> From: Erick Erickson <erickerickson@gmail.com>
> Sent: Friday, April 12, 2019 12:04 AM
> To: vishal patel
> Subject: Re: Shard and replica went down in Solr 6.1.0
>
> Well, it leads to “too many on deck searchers”, obviously.
>
> Here’s most of what you want to know about commits and the like: https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>
> I always push back on any setting that opens a new searcher every second. You might as
well set all your autowarm counts in solrconfig.xml to zero, autowarming isn’t doing you
any good. You’re also creating a lot of garbage churn etc.
>
> Set your soft commit to as long as you can, if you can set expectations like "it may
> take up to a minute before your changes are visible".
>
> > On Apr 11, 2019, at 10:09 AM, vishal patel <vishalpatel200928@outlook.com>
wrote:
> >
> > Actually, in our application bulky documents need to be indexed and we want to see those
> > documents at the same time. So in production we keep the auto commit at 10 minutes and the
> > auto soft commit at 1 second.
> > Is that OK?
> >
> > Get Outlook for Android
> >
> >
> >
> > From: Erick Erickson <erickerickson@gmail.com>
> > Sent: Thursday, April 11, 2019 10:23:00 PM
> > To: vishal patel
> > Subject: Re: Shard and replica went down in Solr 6.1.0
> >
> > We’re not quite on the same page.
> >
> > These config options will _not_ open any new searchers. Period. They’re not the
source of your max warming searchers error. Therefore _somebody_ is issuing commits. You need
to find out who and stop them.
> >
> > Then change your settings in solrconfig to
> > 1> remove maxDoc from autoCommit. It’s probably leading to useless work
> > 2> I’d shorten my maxTime in autoCommit to, say, 1 minute. This isn’t very
important, but 10 minutes is unnecessary
> > 3> Change your autoSoftCommit to as long as your application can tolerate, say
10 minutes if possible.
> > 4> Find out who is issuing commits and stop them.
> >
> > With these settings, unless you have outrageous “autowarmCount” settings in
solrconfig.xml for the caches, you should not see any overlapping on deck searchers. I usually
start with autowarmCount settings in the 10-16 range.
> >
> > Best,
> > Erick
> >
> > > On Apr 11, 2019, at 5:40 AM, vishal patel <vishalpatel200928@outlook.com>
wrote:
> > >
> > >
> > > Thanks Erick,
> > > I got your point. As per you, Solr will not go down due to “performance warning”
and no need to change the maxdoc value. You talked about number of searchers but in solrconfig.xml
there is only 2.[<maxWarmingSearchers>2</maxWarmingSearchers>]
> > >
> > > In production, we have 27 collection, 2 shards and 2 replicas and 3 zoo keepers
and more than 30000 documents indexed within a 10 minutes.
> > > How can I know how many searchers are open at the specific time?
> > > As per my understanding, solr searcher will open when soft or hard commit.
Am I right? And my commit time below
> > > <autoCommit>
> > > <maxTime>600000</maxTime>
> > >     <maxDocs>20000</maxDocs>
> > >     <openSearcher>false</openSearcher>
> > > </autoCommit>
> > >
> > > <autoSoftCommit>
> > > <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
> > > </autoSoftCommit>
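> > >
> > > (For the searcher question above, one way I could probably check is the core's /admin/mbeans
> > > handler with cat=CORE, which lists the registered searcher entries; a hedged SolrJ sketch,
> > > with host and core name as placeholders:)
> > >
> > > import org.apache.solr.client.solrj.SolrQuery;
> > > import org.apache.solr.client.solrj.impl.HttpSolrClient;
> > > import org.apache.solr.client.solrj.response.QueryResponse;
> > >
> > > // Hedged sketch: ask a core's MBeans handler for searcher information.
> > > public class SearcherCheck {
> > >     public static void main(String[] args) throws Exception {
> > >         try (HttpSolrClient solr = new HttpSolrClient("http://localhost:8983/solr/product")) {
> > >             SolrQuery q = new SolrQuery();
> > >             q.setRequestHandler("/admin/mbeans"); // route the request to the MBeans handler
> > >             q.set("cat", "CORE");                 // CORE category includes searcher entries
> > >             q.set("stats", "true");
> > >             QueryResponse rsp = solr.query(q);
> > >             // "solr-mbeans" lists the registered searcher(s) with opened/registered stats;
> > >             // the "Opening [Searcher@...]" log lines are another indicator.
> > >             System.out.println(rsp.getResponse().get("solr-mbeans"));
> > >         }
> > >     }
> > > }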
> > >
> > > I do not write any code that opens the Solr index searcher, and I cannot find any error
> > > related to that.
> > >
> > > All my collections have the same configuration, i.e. the same hard and soft commit times for
> > > all 27 collections. Is it a problem if two or more collections come up for a hard commit at
> > > the same time?
> > >
> > > Below I am again sending more accurate log details.
> > >
> > > Shard-1 Log ::
> > > --------------
> > > 2019-04-08 12:54:01.395 INFO  (commitScheduler-102-thread-1) [c:collection1
s:shard1 r:core_node1 x:collection1] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > > 2019-04-08 12:54:01.395 INFO  (commitScheduler-118-thread-1) [c:collection2
s:shard1 r:core_node1 x:collection2] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > > 2019-04-08 12:54:01.395 INFO  (commitScheduler-110-thread-1) [c:collection3
s:shard1 r:core_node1 x:collection3] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > > 2019-04-08 12:54:01.394 INFO  (commitScheduler-109-thread-1) [c:collection4
s:shard1 r:core_node1 x:collection4] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > > 2019-04-08 12:54:01.394 INFO  (commitScheduler-99-thread-1) [c:collection5
s:shard1 r:core_node1 x:collection5] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > >
> > > 2019-04-08 12:54:01.405 ERROR (updateExecutor-2-thread-36358-processing-http:////10.101.111.80:8983//solr//product
x:product r:core_node1 n:10.102.119.85:8983_solr s:shard1 c:product) [c:product s:shard1 r:core_node1
x:product] o.a.s.u.StreamingSolrClients error
> > > org.apache.solr.common.SolrException: Service Unavailable
> > >
> > >
> > >
> > > request: http://10.101.111.80:8983/solr/product/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.102.119.85%3A8983%2Fsolr%2Fproduct%2F&wt=javabin&version=2
> > > at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:320)
> > > at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:185)
> > > at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$22(ExecutorUtil.java:229)
> > > at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$3/30175207.run(Unknown
Source)
> > > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> > > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> > > at java.lang.Thread.run(Thread.java:745)
> > >
> > >
> > > Shard-1-replica-1 Log ::
> > > --------------
> > > 2019-04-08 12:54:01.294 ERROR (qtp1239731077-1022464) [c:product s:shard1 r:core_node3
x:product] o.a.s.u.p.DistributedUpdateProcessor Request says it is coming from leader, but
we are the leader: update.distrib=FROMLEADER&distrib.from=http://10.102.119.85:8983/solr/product/&wt=javabin&version=2
> > > 2019-04-08 12:54:01.294 INFO  (qtp1239731077-1022464) [c:product s:shard1 r:core_node3
x:product] o.a.s.u.p.LogUpdateProcessorFactory [product]  webapp=/solr path=/update params={update.distrib=FROMLEADER&distrib.from=http://10.102.119.85:8983/solr/product/&wt=javabin&version=2}{}
0 0
> > > 2019-04-08 12:54:01.295 ERROR (qtp1239731077-1022464) [c:product s:shard1 r:core_node3
x:product] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Request says it
is coming from leader, but we are the leader
> > > at org.apache.solr.update.processor.DistributedUpdateProcessor.doDefensiveChecks(DistributedUpdateProcessor.java:621)
> > > at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:392)
> > > at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:320)
> > > at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:679)
> > > at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
> > > at org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:97)
> > > at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:179)
> > > at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:135)
> > > at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:274)
> > > at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:121)
> > > at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:239)
> > > at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:157)
> > > at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:186)
> > > at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:107)
> > > at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:54)
> > > at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
> > > at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:69)
> > > at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:156)
> > > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2036)
> > > at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:657)
> > > at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:464)
> > > at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)
> > > at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)
> > > at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
> > > at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
> > > at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> > > at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> > > at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
> > > at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
> > > at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
> > > at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> > > at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
> > > at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> > > at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
> > > at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
> > > at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> > > at org.eclipse.jetty.server.Server.handle(Server.java:518)
> > > at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
> > > at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
> > > at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
> > > at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
> > > at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
> > > at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)
> > > at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)
> > > at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
> > > at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
> > > at java.lang.Thread.run(Thread.java:745)
> > >
> > > After this, all collections went down one by one, and we cannot identify why this ERROR
> > > came or what it means.
> > >
> > > In production, my versions are:
> > > zookeeper : 3.4.6
> > > java : 8
> > > solrj : 5.2.0
> > >
> > > Please guide me...
> > > From: Erick Erickson <erickerickson@gmail.com>
> > > Sent: Wednesday, April 10, 2019 8:35 PM
> > > To: solr-user@lucene.apache.org
> > > Subject: Re: Shard and replica went down in Solr 6.1.0
> > >
> > > Solr will not go down due to “performance warning”. All that means is that
you’re opening too many searchers in a short time. If you’re somehow opening a huge number
of searchers, then maybe but IIRC you can’t really open more than two. And don’t think
you can “fix” this by upping maxWarmingSearchers, that’ll only make the problem worse
as every new searcher chews up memory that’s kept until the old searcher is done with outstanding
requests.
> > >
> > > Which is weird because with those settings you shouldn’t be opening _any_
new searchers. So my guess is that some external client is doing that and this is usually
an anti-pattern. Don’t do it, please. Just set your soft commit to what you need and leave
it at that. As long as possible. Here are all the gory details:
> > > https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> > >
> > > I’d also not bother with maxDocs, that’s much less predictable than just
setting a reasonable time for autocommit, a minute or two is usually reasonable especially
when openSearcher is false.
> > >
> > > Not sure what’s really generating that error, take a look at all your other
Solr logs to see if there’s a cause.
> > >
> > > Best,
> > > Erick
> > >
> > >
> > > > On Apr 10, 2019, at 5:21 AM, vishal patel <vishalpatel200928@outlook.com>
wrote:
> > > >
> > > > I have 2 shards and 2 replicas on Solr 6.1.0. One shard and one replica went down, and I
> > > > got the ERROR below:
> > > >
> > > > 2019-04-08 12:54:01.469 INFO  (commitScheduler-131-thread-1) [c:products
s:shard1 r:core_node1 x:product1] o.a.s.s.SolrIndexSearcher Opening [Searcher@24b9127f[product1]
main]
> > > > 2019-04-08 12:54:01.468 INFO  (commitScheduler-110-thread-1) [c:product2
s:shard1 r:core_node1 x:product2] o.a.s.c.SolrDeletionPolicy SolrDeletionPolicy.onCommit:
commits: num=2
> > > > commit{dir=G:\SolrCloud\solr1\server\solr\product2\data\index.20180412060518798,segFN=segments_he5,generation=22541}
> > > > commit{dir=G:\SolrCloud\solr1\server\solr\product2\data\index.20180412060518798,segFN=segments_he6,generation=22542}
> > > > 2019-04-08 12:54:01.556 INFO  (commitScheduler-110-thread-1) [c:product2
s:shard1 r:core_node1 x:product2] o.a.s.c.SolrDeletionPolicy newest commit generation = 22542
> > > > 2019-04-08 12:54:01.465 WARN (commitScheduler-136-thread-1) [c:product3
s:shard1 r:core_node1 x:product3] o.a.s.c.SolrCore [product3] PERFORMANCE WARNING: Overlapping
onDeckSearchers=2
> > > >
> > > > 2019-04-08 12:54:01.534 ERROR (updateExecutor-2-thread-36358-processing-http:////10.101.111.80:8983//solr//product3
x:product3 r:core_node1 n:10.102.119.85:8983_solr s:shard1 c:product3) [c:product3 s:shard1
r:core_node1 x:product3] o.a.s.u.StreamingSolrClients error
> > > > org.apache.solr.common.SolrException: Service Unavailable
> > > >
> > > > request: http://10.101.111.80:8983/solr/product3/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.102.119.85%3A8983%2Fsolr%2Fproduct3%2F&wt=javabin&version=2
> > > > at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:320)
> > > > at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:185)
> > > > at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$22(ExecutorUtil.java:229)
> > > > at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$3/30175207.run(Unknown
Source)
> > > > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> > > > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> > > > at java.lang.Thread.run(Thread.java:745)
> > > >
> > > > Note: product1, product2 and product3 are my collections.
> > > >
> > > > In my solrconfig.xml
> > > > <autoCommit>
> > > > <maxTime>600000</maxTime>
> > > >    <maxDocs>20000</maxDocs>
> > > >    <openSearcher>false</openSearcher>
> > > > </autoCommit>
> > > >
> > > > <autoSoftCommit>
> > > > <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
> > > > </autoSoftCommit>
> > > > <maxWarmingSearchers>2</maxWarmingSearchers>
> > > >
> > > > Many documents were committed at that time, and I found many commitScheduler threads in
> > > > the log.
> > > > Is it possible that Solr went down due to the warning "PERFORMANCE WARNING: Overlapping
> > > > onDeckSearchers=2"?
> > > > Do I need to update my autoCommit or maxWarmingSearchers?
> > > >
> > > > Sent from Outlook<http://aka.ms/weboutlook>

