lucene-solr-user mailing list archives

From alwaysbluesky <richkingmyst...@gmail.com>
Subject source cluster sends incorrect recovery request to target cluster when CDCR is enabled
Date Wed, 08 Jan 2020 03:27:35 GMT
Hi,

I'm running Solr 7.7.2 on a cluster with 3 replicas.

When CDCR is enabled, one of the target nodes gets an incorrect recovery
request.

Below is the content of the collection's state.json file from ZooKeeper:

"shards":{"shard1":{
        "range":"80000000-7fffffff",
        "state":"active",
        "replicas":{
          "core_node3":{
            "core":"tbh_manuals_test_bi2_shard1_replica_n1",
            "base_url":"https://host1:8983/solr",
            "node_name":"host1:8983_solr",
            "state":"active",
            "type":"NRT",
            "force_set_state":"false"},
          "core_node5":{
            "core":"tbh_manuals_test_bi2_shard1_replica_n2",
            "base_url":"https://host2:8983/solr",
            "node_name":"host2:8983_solr",
            "state":"active",
            "type":"NRT",
            "force_set_state":"false",
            "leader":"true"},
          "core_node6":{
            "core":"tbh_manuals_test_bi2_shard1_replica_n4",
            "base_url":"https://host3:8983/solr",
            "node_name":"host3:8983_solr",
            "state":"active",
            "type":"NRT",
            "force_set_state":"false"}}}}}}

As shown above, host1 does not host tbh_manuals_test_bi2_shard1_replica_n4
(that core is on host3). However, host1 is receiving a request to recover
tbh_manuals_test_bi2_shard1_replica_n4, which causes an
"Unable to locate core" error.
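
To double-check the mapping, here is a minimal Python sketch that lists which cores each node hosts. It assumes the replica entries from the state.json excerpt above are pasted into a string, trimmed to the fields needed. The helper names cores_by_node and node_hosting are made up for illustration and are not part of Solr.

```python
import json

# Replica entries copied from the state.json excerpt above, trimmed to the
# "core" and "node_name" fields (assumption: the other fields are not needed
# for this check).
STATE_JSON = """
{"shards": {"shard1": {"replicas": {
  "core_node3": {"core": "tbh_manuals_test_bi2_shard1_replica_n1",
                 "node_name": "host1:8983_solr"},
  "core_node5": {"core": "tbh_manuals_test_bi2_shard1_replica_n2",
                 "node_name": "host2:8983_solr"},
  "core_node6": {"core": "tbh_manuals_test_bi2_shard1_replica_n4",
                 "node_name": "host3:8983_solr"}
}}}}
"""


def cores_by_node(state):
    """Map each node_name to the list of core names it hosts."""
    result = {}
    for shard in state["shards"].values():
        for replica in shard["replicas"].values():
            result.setdefault(replica["node_name"], []).append(replica["core"])
    return result


def node_hosting(state, core_name):
    """Return the node_name that hosts core_name, or None if no node does."""
    for node, cores in cores_by_node(state).items():
        if core_name in cores:
            return node
    return None


state = json.loads(STATE_JSON)
for node, cores in sorted(cores_by_node(state).items()):
    print(node, cores)
print(node_hosting(state, "tbh_manuals_test_bi2_shard1_replica_n4"))
```

For the excerpt above, this reports host3:8983_solr as the only node hosting replica_n4, so a recovery request for that core should not be arriving at host1.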

Below is the full error log from host1 on the target cluster:

2020-01-08 03:05:52.355 INFO  (zkCallback-7-thread-14) [   ] o.a.s.c.c.ZkStateReader A cluster state change: [WatchedEvent state:SyncConnected type:NodeDataChanged path:/collections/tbh_manuals_test_bi2/state.json] for collection [tbh_manuals_test_bi2] has occurred - updating... (live nodes size: [3])
2020-01-08 03:05:52.355 INFO  (zkCallback-7-thread-15) [   ] o.a.s.c.c.ZkStateReader A cluster state change: [WatchedEvent state:SyncConnected type:NodeDataChanged path:/collections/tbh_manuals_test_bi2/state.json] for collection [tbh_manuals_test_bi2] has occurred - updating... (live nodes size: [3])
2020-01-08 03:05:52.378 INFO  (qtp1155769010-87) [   x:tbh_manuals_test_bi2_shard1_replica_n4] o.a.s.h.a.CoreAdminOperation It has been requested that we recover: core=tbh_manuals_test_bi2_shard1_replica_n4
2020-01-08 03:05:52.379 ERROR (qtp1155769010-87) [   x:tbh_manuals_test_bi2_shard1_replica_n4] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Unable to locate core tbh_manuals_test_bi2_shard1_replica_n4
	at org.apache.solr.handler.admin.CoreAdminOperation.lambda$static$5(CoreAdminOperation.java:167)
	at org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:360)
	at org.apache.solr.handler.admin.CoreAdminHandler$CallInfo.call(CoreAdminHandler.java:396)
	at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:180)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
	at org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:736)
	at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:717)
	at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:496)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:395)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:341)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1588)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1557)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
	at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
	at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
	at org.eclipse.jetty.server.Server.handle(Server.java:502)
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364)
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
	at org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.onFillable(SslConnection.java:411)
	at org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:305)
	at org.eclipse.jetty.io.ssl.SslConnection$2.succeeded(SslConnection.java:159)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
	at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
	at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
	at java.base/java.lang.Thread.run(Thread.java:834)



How can I make the source cluster send the recovery request to the correct
node on the target cluster?


