lucene-solr-user mailing list archives

From Jeffery Yuan <yuanyun...@gmail.com>
Subject Duplicate docs in pagination with same score in Solr Cloud
Date Thu, 05 May 2016 06:50:50 GMT
We are running a match-all query in SolrCloud (for example, one shard with 3
replicas), so all docs have the same score.

In Solr, if docs have the same score, they are sorted by internal docid.
In SolrCloud, CloudSolrClient sends requests to different replicas in
round-robin fashion.

The same document may have a different internal docid on different replicas.

So suppose that on replicaA docA has docid 1, while on replicaB its docid
is 11.
Then if the first query (the first 10 rows) is sent to replicaA and the second
query (the next 10 rows) is sent to replicaB, will the same doc be returned twice?

Is this possible?
If so, does this mean that to avoid duplicate data we have to add a
sort to break ties between docs with the same score?
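
To make the scenario concrete, here is a small self-contained Python simulation (it does not talk to Solr). Everything in it is hypothetical: the doc names, the two replica orderings, and the fetch_page/paginate helpers are made up for illustration. It only models "each replica returns equal-score docs in its own internal order" and compares paging with and without a tie-breaking sort on a unique key.

```python
from itertools import cycle

# Hypothetical data: 20 documents that all have the same score.
doc_ids = [f"doc{i}" for i in range(20)]

# Assumption for illustration: each replica holds the same documents but in
# a different internal (docid) order. replicaB is simply replicaA reversed.
replicas = {
    "replicaA": list(doc_ids),
    "replicaB": list(reversed(doc_ids)),
}


def fetch_page(replica, start, rows, tie_break=False):
    """Return one page from a replica.

    Without a tie-breaker, equal-score docs come back in the replica's
    internal order. With a tie-breaker, they are ordered by a unique key
    (here the doc name), which is identical on every replica.
    """
    docs = replicas[replica]
    if tie_break:
        docs = sorted(docs)
    return docs[start:start + rows]


def paginate(tie_break):
    """Fetch two pages of 10 rows, alternating replicas per request."""
    targets = cycle(["replicaA", "replicaB"])
    collected = []
    for start in range(0, 20, 10):
        collected.extend(fetch_page(next(targets), start, 10, tie_break))
    return collected


without_tie_break = paginate(tie_break=False)
with_tie_break = paginate(tie_break=True)

print("unique docs without tie-breaker:", len(set(without_tie_break)))
print("unique docs with tie-breaker:   ", len(set(with_tie_break)))
```

In this toy setup the second page repeats docs from the first page when no tie-breaker is used, while the tie-broken version returns each doc once. Whether real replicas actually diverge this way depends on their segment and merge history, so treat this as a sketch of the concern rather than a reproduction.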

- This happened to me in my test environment: we copied the Solr data from
another machine, made some changes using atomic updates, then paginated
through the results to export the same data, and found that duplicate docs
were returned.
-- My fix is to sort the data by updateDate.
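
A sketch of what the request parameters for such a paged export might look like, built in Python. The updateDate field is from the fix above; the unique key field name "id" and the page_params helper are assumptions for illustration. The unique key is appended as a final sort clause because two docs could share the same updateDate, in which case the tie would fall back to internal docid again.

```python
from urllib.parse import urlencode


def page_params(start, rows, sort_field="updateDate", unique_key="id"):
    """Build query parameters for one page of a match-all export.

    The sort ends with the unique key (assumed to be "id" here) so that
    the ordering is fully determined even when sort_field values are equal.
    """
    return {
        "q": "*:*",
        "start": start,
        "rows": rows,
        "sort": f"{sort_field} asc,{unique_key} asc",
    }


# Parameters for the first two pages of 10 rows each.
first_page = page_params(0, 10)
second_page = page_params(10, 10)

# URL-encoded form, ready to append to a select handler URL.
query_string = urlencode(first_page)
print(query_string)
```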




--
View this message in context: http://lucene.472066.n3.nabble.com/Duplicate-docs-in-pagination-with-same-score-in-Solr-Cloud-tp4274712.html
Sent from the Solr - User mailing list archive at Nabble.com.
