lucene-solr-user mailing list archives

From Lucky Sharma <>
Subject Re: Different results due to sharding and problems with interesting terms in MLT
Date Sat, 28 Sep 2019 04:29:05 GMT
Hi Salmaan,

1. For the first issue:
     One suggestion is to not emit [@, ., -, _, +, #, *] as individual
tokens. I guess you would need to update your tokenizer for that.
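
To illustrate the idea in point 1, here is a minimal Python sketch (not Solr
code) of a tokenizer that keeps the special characters only when they are
attached to a letter or digit, so a lone "." never becomes a term. The function
name, the regular expression and the sample text are my own assumptions for
illustration; in Solr the equivalent change would go into the analyzer chain
of the field that MLT reads.

```python
import re

# Characters allowed inside a token, per the original question.
# A token must contain at least one letter or digit, so a lone "."
# (or any other lone special character) is never emitted.
TOKEN_RE = re.compile(r"[@._+#*\-]*[A-Za-z0-9][A-Za-z0-9@._+#*\-]*")


def tokenize(text):
    """Hypothetical tokenizer: keeps e.g. 'C++' and 'Node.js' whole."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        # Drop a sentence-ending dot such as the one in "AngularJS."
        token = match.group(0).rstrip(".")
        if token:
            tokens.append(token)
    return tokens


sample = "Mail john.doe@example.com . Skills: C++, C#, Node.js, AngularJS."
tokens = tokenize(sample)
```

With this, "." can no longer show up as an interesting term, while emails and
skill names such as "C#" stay searchable as single tokens.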

2. For the second issue, is the score the same for both results? If the
scores are the same and the queries are the same, then the reason would be
the Lucene doc ID. I have observed the same thing in Solr 7.6.0, and the
cause in my case was that the docID for the same doc can differ between the
two nodes. To get the same record order every time, add "id desc" as the
very last sort criterion.
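
A small Python sketch of why the tie-breaker in point 2 helps (again not Solr
code; the function name and the sample documents are made up for
illustration). When two documents have the same score, their relative order
depends on the order in which they arrive, unless a unique field breaks the
tie. In a Solr request this would correspond to something like
sort=score desc,id desc, assuming "id" is your unique key field.

```python
def order_results(docs):
    """Sort by score descending, then by id descending as the tie-breaker."""
    return sorted(docs, key=lambda d: (d["score"], d["id"]), reverse=True)


# The same three hits as two different nodes might return them.
# doc3 and doc7 have identical scores, so their order is otherwise arbitrary.
from_node_a = [
    {"id": "doc7", "score": 1.5},
    {"id": "doc3", "score": 1.5},
    {"id": "doc9", "score": 2.0},
]
from_node_b = [
    {"id": "doc3", "score": 1.5},
    {"id": "doc9", "score": 2.0},
    {"id": "doc7", "score": 1.5},
]

ids_a = [d["id"] for d in order_results(from_node_a)]
ids_b = [d["id"] for d in order_results(from_node_b)]
```

Both lists come out in the same order regardless of which node answered,
because the id decides every tie.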

Lucky Sharma

On Sat, 28 Sep, 2019, 8:22 am Salmaan Rashid Syed, <> wrote:

> Hi Solr Users,
> I have two questions,
> 1) I am working on Solr 7.6 and have incorporated the MLT feature into it. I
> need to allow users to search on emails and skills, so I have allowed a few
> special characters such as [@, ., -, _, +, #, *]. I am not using a
> stemmer because it removes the letter "s" from many useful words, e.g.
> "AngularJS" becomes "AngularJ".
> Now when I enter processed text as a query in the search bar, I usually get
> "." as the "*most interesting term*", with the highest boost. I
> can't figure out how to remove it from the interesting terms without removing
> it from the field I am searching in.
> 2) I have 2 shards per collection on two nodes, 8983 and 7574, in cloud
> mode. I am getting different results for the same query.
> From reading forums and documentation, I have learned that this
> happens because of sharding: stats are calculated on individual
> shards rather than on the entire collection. So I implemented one of the
> solutions mentioned in the forums/documentation in solrconfig.xml, as follows:
> <statsCache class=""/>
> It still doesn't work and gives different results for the same query. Please
> let me know what can be done to avoid these issues.
> Regards,
> Salmaan
