lucene-dev mailing list archives

From "Mark Miller (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-4046) An instance of CloudSolrServer is not able to handle consecutive request on different collections o.a.
Date Wed, 07 Nov 2012 16:01:15 GMT

    [ https://issues.apache.org/jira/browse/SOLR-4046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492447#comment-13492447
] 

Mark Miller commented on SOLR-4046:
-----------------------------------

Yeah, I honestly had the same thought when I was fixing this - I almost dropped the caching
completely - it didn't seem like the perf would be much different, and the code is complicated.
It was mostly a roll of the dice that I ended up keeping the caching. Mostly, I was too lazy
to test whether it mattered (even though, intuitively, I doubt it would).

I'll keep this open until I'm home from Germany and can take another look at it.
                
> An instance of CloudSolrServer is not able to handle consecutive request on different
collections o.a.
> ------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-4046
>                 URL: https://issues.apache.org/jira/browse/SOLR-4046
>             Project: Solr
>          Issue Type: Bug
>          Components: clients - java, SolrCloud
>    Affects Versions: 4.0
>         Environment: Solr 4.0.0. Actually revision 1394844 on branch lucene_solr_4_0
but I believe that is the same
>            Reporter: Per Steffensen
>            Priority: Critical
>         Attachments: SOLR-4046.patch
>
>
> CloudSolrServer saves urlList, leaderUrlList and replicasList on instance level, and
only recalculates those lists in case of clusterState changes. The values calculated for the
lists will be different for different target-collections. Therefore they also ought to be recalculated
for a request R if the target-collection for R is different from the target-collection for
the request handled just before R by the same CloudSolrServer instance.
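The collection mix-up described above can be shown without a running cluster. What follows is only a minimal sketch, not the actual CloudSolrServer code: the class names, the `lookup` helper and the host URLs are all hypothetical. `InstanceLevelCache` mimics the reported behaviour (one cached list, invalidated only by a cluster state change), while `PerCollectionCache` shows one possible fix that also keys the cached list on the collection name.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class CollectionCacheSketch {

    /** Mimics the reported behaviour: a single cached list, refreshed only on a cluster state change. */
    static class InstanceLevelCache {
        private final Function<String, List<String>> lookup;
        private List<String> urlList;
        private int lastClusterStateHashCode;

        InstanceLevelCache(Function<String, List<String>> lookup) {
            this.lookup = lookup;
        }

        List<String> urlsFor(String collection, int clusterStateHashCode) {
            if (urlList == null || clusterStateHashCode != lastClusterStateHashCode) {
                urlList = lookup.apply(collection);
                lastClusterStateHashCode = clusterStateHashCode;
            }
            // Stale whenever 'collection' differs from the collection of the previous request.
            return urlList;
        }
    }

    /** One possible fix: cache per collection and drop every entry when the cluster state changes. */
    static class PerCollectionCache {
        private final Function<String, List<String>> lookup;
        private final Map<String, List<String>> urlLists = new HashMap<>();
        private int lastClusterStateHashCode;

        PerCollectionCache(Function<String, List<String>> lookup) {
            this.lookup = lookup;
        }

        List<String> urlsFor(String collection, int clusterStateHashCode) {
            if (clusterStateHashCode != lastClusterStateHashCode) {
                urlLists.clear();
                lastClusterStateHashCode = clusterStateHashCode;
            }
            return urlLists.computeIfAbsent(collection, lookup);
        }
    }

    /** Hypothetical stand-in for deriving node URLs from the cluster state. */
    static List<String> lookup(String collection) {
        return Arrays.asList(
                "http://host1:8983/solr/" + collection,
                "http://host2:8983/solr/" + collection);
    }

    public static void main(String[] args) {
        InstanceLevelCache buggy = new InstanceLevelCache(CollectionCacheSketch::lookup);
        buggy.urlsFor("collection1", 42);
        // Same cluster state, different collection: the collection1 URLs are returned again.
        System.out.println(buggy.urlsFor("collection2", 42));

        PerCollectionCache fixed = new PerCollectionCache(CollectionCacheSketch::lookup);
        fixed.urlsFor("collection1", 42);
        System.out.println(fixed.urlsFor("collection2", 42));
    }
}
```

Dropping the cache entirely and recomputing the lists on every request would be an even simpler alternative; the sketch makes no claim about which option performs better.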
> Another problem with the implementation in CloudSolrServer is with the lastClusterStateHashCode.
lastClusterStateHashCode is updated when the first request after a clusterState-change is
handled. Before the lastClusterStateHashCode is updated, one of the following two sets of lists
is updated:
> * In case sendToLeader==true for the request: leaderUrlList and replicasList are updated,
but not urlList
> * In case sendToLeader==false for the request: urlList is updated, but not leaderUrlList
and replicasList
> But the lastClusterStateHashCode is always updated. So even if there were just one
collection in the world, there would still be a problem: If the first request after a clusterState-change
is a sendToLeader==true-request, urlList will (potentially) be wrong (and will not be recalculated)
for the next sendToLeader==false-request to the same CloudSolrServer instance. If the first
request after a clusterState-change is a sendToLeader==false-request, leaderUrlList and replicasList
will (potentially) be wrong (and will not be recalculated) for the next sendToLeader==true-request
to the same CloudSolrServer instance.
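The lastClusterStateHashCode problem can be sketched in isolation as well. This is a simplified model under stated assumptions, not the real implementation: the class, both refresh methods and the `state` label (a stand-in for whatever the current cluster state yields) are hypothetical. `refreshPartially` follows the behaviour described in the report, and `refreshAll` shows one way to keep the stored hash consistent with all three lists.

```java
import java.util.Collections;
import java.util.List;

class ClusterStateHashSketch {
    // Hypothetical stand-ins for the three cached lists named in the report.
    List<String> urlList = Collections.emptyList();
    List<String> leaderUrlList = Collections.emptyList();
    List<String> replicasList = Collections.emptyList();
    int lastClusterStateHashCode;

    /** Reported behaviour: only the lists this request needs are refreshed, yet the hash is always stored. */
    void refreshPartially(boolean sendToLeader, int clusterStateHashCode, String state) {
        if (clusterStateHashCode == lastClusterStateHashCode) {
            return; // cache is considered fresh
        }
        if (sendToLeader) {
            leaderUrlList = Collections.singletonList("leader@" + state);
            replicasList = Collections.singletonList("replica@" + state);
        } else {
            urlList = Collections.singletonList("node@" + state);
        }
        // Storing the hash here hides the fact that the other lists were not refreshed.
        lastClusterStateHashCode = clusterStateHashCode;
    }

    /** One possible fix: refresh all three lists together whenever the hash changes. */
    void refreshAll(int clusterStateHashCode, String state) {
        if (clusterStateHashCode == lastClusterStateHashCode) {
            return;
        }
        leaderUrlList = Collections.singletonList("leader@" + state);
        replicasList = Collections.singletonList("replica@" + state);
        urlList = Collections.singletonList("node@" + state);
        lastClusterStateHashCode = clusterStateHashCode;
    }

    public static void main(String[] args) {
        ClusterStateHashSketch buggy = new ClusterStateHashSketch();
        buggy.refreshAll(1, "state1");              // start from a fully populated cache
        buggy.refreshPartially(true, 2, "state2");  // first request after the change goes to the leader
        buggy.refreshPartially(false, 2, "state2"); // hash already matches, so urlList is not refreshed
        System.out.println(buggy.urlList);          // still built from state1

        ClusterStateHashSketch fixed = new ClusterStateHashSketch();
        fixed.refreshAll(1, "state1");
        fixed.refreshAll(2, "state2");
        System.out.println(fixed.urlList);          // built from state2
    }
}
```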
> Besides that, it is a very bad idea to have instance variables and local method variables with
the same name. CloudSolrServer has an instance variable called urlList, and the method CloudSolrServer.request
has a local variable also called urlList while it operates on the instance variable
urlList as well. This makes the code hard to read.
> I haven't written a test within Apache Solr to reproduce the main error (the one mentioned
at the top above), but I guess you can easily do it yourself:
> Make a setup with two collections "collection1" and "collection2" - no default collection.
Add some documents to "collection2" (without any autocommit). Then do cloudSolrServer.commit("collection1")
and afterwards cloudSolrServer.commit("collection2") (use same instance of CloudSolrServer).
Then try to search collection2 for the documents you inserted into it. They ought to be found,
but are not, because the cloudSolrServer.commit("collection2") will not do a commit of collection2
- it will actually do a commit of collection1.
> Well, actually you can't do cloudSolrServer.commit(<collection-name>) (the method
doesn't exist), but that ought to be corrected too. But you can do the following instead:
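> {code}
> // Workaround: send an explicit commit with the target collection set as a request parameter.
> UpdateRequest req = new UpdateRequest();
> req.setAction(UpdateRequest.ACTION.COMMIT, true, true); // waitFlush = true, waitSearcher = true
> req.setParam(CoreAdminParams.COLLECTION, <collection-name>); // substitute the collection name, e.g. "collection2"
> req.process(cloudSolrServer);
> {code}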
> In general I think you should add more tests to your test suite - tests that run Solr clusters
with more than one collection and exercise requests against each of them.


