lucene-dev mailing list archives

From "Shalin Shekhar Mangar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-6235) SyncSliceTest fails on jenkins with no live servers available error
Date Thu, 10 Jul 2014 15:58:04 GMT

    [ https://issues.apache.org/jira/browse/SOLR-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057613#comment-14057613 ]

Shalin Shekhar Mangar commented on SOLR-6235:
---------------------------------------------

bq. but why would all the cores have the name "collection1"? Is that valid or an indication
of a problem upstream from this code?

The reasons are as Mark described, but it is a supported use case and pretty common. Imagine stock
Solr running on 4 nodes: each node would have a core with the same name, collection1.

bq. Also, you raise a good point about all replicas thinking they are in leader-initiated
recovery (LIR). In ElectionContext, when running shouldIBeLeader, the node will choose to
not be the leader if it is in LIR. However, this could lead to no leader. My thinking there
is the state is bad enough that we would need manual intervention to clear one of the LIR
znodes to allow a replica to get past this point. But maybe we can do better here?

Good question. With careful use of minRf, the user can retry operations and maintain consistency
even if we arbitrarily elect a leader in this case. But most people won't use minRf and care
more about availability than about consistency. For them there should be an easy way to get
out of this mess. We could add a collection property (boolean + timeout value) to force-elect
a leader even if all replicas of a shard are in LIR. What do you think?
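
To make the proposal concrete, here is a minimal sketch of how such a property might gate the election decision. Everything in it is an assumption for illustration: `ForcedLeaderSketch`, `ForceLeaderPolicy`, `mayBecomeLeader`, and their parameters are hypothetical names, not existing Solr classes or settings, and the real check in ElectionContext would read LIR state from ZooKeeper rather than take it as arguments.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these class, method, or field names exist in Solr.
public class ForcedLeaderSketch {

    // Assumed shape of the proposed collection property: a boolean plus a timeout.
    static final class ForceLeaderPolicy {
        final boolean enabled;
        final long timeoutMs;

        ForceLeaderPolicy(boolean enabled, long timeoutMs) {
            this.enabled = enabled;
            this.timeoutMs = timeoutMs;
        }
    }

    // Decides whether this replica may stand for election.
    // selfInLir:    this replica is marked as being in leader-initiated recovery
    // othersInLir:  LIR flag of every other replica of the same shard
    // leaderlessMs: how long the shard has been without a leader
    static boolean mayBecomeLeader(boolean selfInLir, List<Boolean> othersInLir,
                                   long leaderlessMs, ForceLeaderPolicy policy) {
        if (!selfInLir) {
            return true; // a replica outside LIR is always eligible
        }
        // True only when no other replica is outside LIR.
        boolean everyoneInLir = !othersInLir.contains(Boolean.FALSE);
        // Force an election only if the property is on, nobody healthier
        // exists, and the shard has been leaderless for the full timeout.
        return policy.enabled && everyoneInLir && leaderlessMs >= policy.timeoutMs;
    }

    public static void main(String[] args) {
        ForceLeaderPolicy policy = new ForceLeaderPolicy(true, 30_000L);
        List<Boolean> others = Arrays.asList(true, true, true);
        System.out.println(mayBecomeLeader(true, others, 5_000L, policy));  // prints false
        System.out.println(mayBecomeLeader(true, others, 45_000L, policy)); // prints true
    }
}
```

The timeout keeps the default behaviour (no leader while every replica is in LIR) for a grace period, so a replica that is not in LIR still gets the first chance to win the election.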

> SyncSliceTest fails on jenkins with no live servers available error
> -------------------------------------------------------------------
>
>                 Key: SOLR-6235
>                 URL: https://issues.apache.org/jira/browse/SOLR-6235
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud, Tests
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Shalin Shekhar Mangar
>             Fix For: 4.10
>
>
> {code}
> 1 tests failed.
> FAILED:  org.apache.solr.cloud.SyncSliceTest.testDistribSearch
> Error Message:
> No live SolrServers available to handle this request
> Stack Trace:
> org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request
>         at __randomizedtesting.SeedInfo.seed([685C57B3F25C854B:E9BAD9AB8503E577]:0)
>         at org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:317)
>         at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:659)
>         at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:91)
>         at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:301)
>         at org.apache.solr.cloud.AbstractFullDistribZkTestBase.checkShardConsistency(AbstractFullDistribZkTestBase.java:1149)
>         at org.apache.solr.cloud.AbstractFullDistribZkTestBase.checkShardConsistency(AbstractFullDistribZkTestBase.java:1118)
>         at org.apache.solr.cloud.SyncSliceTest.doTest(SyncSliceTest.java:236)
>         at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:865)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


