lucene-dev mailing list archives

From "Markus Jelsma (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SOLR-8061) GlobalStats, incorrect order of debug results
Date Wed, 16 Sep 2015 11:52:46 GMT

    [ https://issues.apache.org/jira/browse/SOLR-8061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747350#comment-14747350 ]

Markus Jelsma edited comment on SOLR-8061 at 9/16/15 11:52 AM:
---------------------------------------------------------------

Ah, it is indeed a duplicate, but 7759 is not yet marked for 5.4 or any other version. I am actually
investigating a problem with GlobalStats where, in a sharded environment with ExactStats enabled,
different scores are returned for the same queries. Without a working debug component, that is very
hard to investigate.
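
To illustrate why the choice of term statistics matters for scores, here is a minimal Python sketch of the BM25 idf and tfNorm terms. The two formulas are the commonly documented Lucene BM25 forms, and they reproduce the idf and tfNorm values printed in the explain output quoted further down in this message. The "local shard" numbers (docFreq=3, maxDocs=15000) are purely hypothetical and only show how a different set of statistics changes the idf; they are not taken from the issue.

```python
import math


def bm25_idf(doc_freq: int, max_docs: int) -> float:
    """BM25 idf in the form ln(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1.0 + (max_docs - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25_tf_norm(freq: float, k1: float, b: float,
                 field_length: float, avg_field_length: float) -> float:
    """BM25 term-frequency normalisation with length normalisation."""
    length_factor = 1.0 - b + b * field_length / avg_field_length
    return freq * (k1 + 1.0) / (freq + k1 * length_factor)


# Inputs copied from the explain output in this message.
idf_reported = bm25_idf(doc_freq=13, max_docs=60599)
tf_norm_reported = bm25_tf_norm(freq=1.0, k1=0.3, b=0.75,
                                field_length=4.0,
                                avg_field_length=11.450568)

# HYPOTHETICAL per-shard statistics, invented for illustration only.
idf_local = bm25_idf(doc_freq=3, max_docs=15000)

print(f"idf from explain inputs:    {idf_reported:.6f}")
print(f"tfNorm from explain inputs: {tf_norm_reported:.7f}")
print(f"idf with hypothetical shard-local stats: {idf_local:.6f}")
```

Because idf depends on docFreq and maxDocs, a score computed from one set of statistics and an explanation computed from another will not agree, which is one possible reading of the symptoms described here.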


was (Author: markus17):
Ah, it is a duplicate indeed but 7759 is not yet marked for 5.4 or any version.

> GlobalStats, incorrect order of debug results
> ---------------------------------------------
>
>                 Key: SOLR-8061
>                 URL: https://issues.apache.org/jira/browse/SOLR-8061
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 5.3
>            Reporter: Markus Jelsma
>             Fix For: 5.4
>
>         Attachments: 8601_full_solr_response.txt
>
>
> It is impossible to reliably debug the scoring results when GlobalStats is enabled.
> Here are the top 5 IDs and their scores:
> {code}
> <result name="response" numFound="1258" start="0" maxScore="100.59861">
>   <doc>
>     <str name="id">http://www.example.org/medicijnen/paracetamol?product=paracetamol</str>
>     <float name="score">100.59861</float></doc>
>   <doc>
>     <str name="id">http://www.example.org/medicijnen/paracetamol?product=roter-paracetamol</str>
>     <float name="score">100.42987</float></doc>
>   <doc>
>     <str name="id">http://www.example.org/medicijnen/paracetamol?product=sinaspril-paracetamol</str>
>     <float name="score">100.42986</float></doc>
>   <doc>
>     <str name="id">http://www.example.org/medicijnen/paracetamol</str>
>     <float name="score">99.93343</float></doc>
>   <doc>
>     <str name="id">http://www.example.org/producten/paracetamolvitamine-c</str>
>     <float name="score">99.762596</float></doc>
> {code}
> This is the final debugging information, shortened for readability; the full response is attached:
> {code}
> <lst name="explain">
>     <str name="http://www.apotheek.nl/medicijnen/paracetamol?product=paracetamol">
> 101.406906 = max plus 0.65 times others of:
>   21.73707 = weight(content_nl:paracetamol^2.2 in 39285) [], result of:
>     21.73707 = score(doc=39285,freq=59.0 = termFreq=59.0
> ...
> </str>
>     <str name="http://www.apotheek.nl/medicijnen/paracetamol?product=roter-paracetamol">
> 99.26059 = max plus 0.65 times others of:
>   21.501307 = weight(content_nl:paracetamol^2.2 in 3186) [], result of:
>     21.501307 = score(doc=3186,freq=59.0 = termFreq=59.0
> ...
> </str>
>     <str name="http://www.apotheek.nl/medicijnen/paracetamol?product=sinaspril-paracetamol">
> 99.26059 = max plus 0.65 times others of:
>   21.501307 = weight(content_nl:paracetamol^2.2 in 3219) [], result of:
>     21.501307 = score(doc=3219,freq=59.0 = termFreq=59.0
> ...
> ), product of:
>       7.4 = boost
>       8.409361 = idf(docFreq=13, maxDocs=60599)
>       1.1269082 = tfNorm, computed from:
>         1.0 = termFreq=1.0
>         0.3 = parameter k1
>         0.75 = parameter b
>         11.450568 = avgFieldLength
>         4.0 = fieldLength
> </str>
>     <str name="http://www.apotheek.nl/medicijnen/paracetamol">
> 100.7385 = max plus 0.65 times others of:
>   21.73707 = weight(content_nl:paracetamol^2.2 in 39673) [], result of:
>     21.73707 = score(doc=39673,freq=59.0 = termFreq=59.0
> ...
> </str>
>     <str name="http://www.apotheek.nl/producten/paracetamolvitamine-c">
> 100.57981 = max plus 0.65 times others of:
>   17.886435 = weight(content_nl:paracetamol^2.2 in 45385) [], result of:
>     17.886435 = score(doc=45385,freq=5.0 = termFreq=5.0
> ...
> </str>
> {code}
> I compared the docIds with a result set retrieved without GlobalStats: the order of the document
> IDs is correct and the docIds match. It looks like the debug scores themselves are incorrect,
> and thus also wrongly sorted.
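
The mismatch described above can be checked mechanically. The following Python sketch hard-codes the five result scores and the five top-level explain scores quoted in the description and compares them. The hosts differ between the two snippets (www.example.org versus www.apotheek.nl), so the sketch matches documents on path and query string only; that matching rule, the tolerance, and the helper names `doc_key` and `compare` are assumptions made for illustration, not part of Solr.

```python
from urllib.parse import urlsplit


def doc_key(url: str) -> str:
    """Identify a document by path and query string, ignoring the host."""
    parts = urlsplit(url)
    return parts.path + ("?" + parts.query if parts.query else "")


# Scores from the <result> block, in the order returned.
result_scores = [
    ("http://www.example.org/medicijnen/paracetamol?product=paracetamol", 100.59861),
    ("http://www.example.org/medicijnen/paracetamol?product=roter-paracetamol", 100.42987),
    ("http://www.example.org/medicijnen/paracetamol?product=sinaspril-paracetamol", 100.42986),
    ("http://www.example.org/medicijnen/paracetamol", 99.93343),
    ("http://www.example.org/producten/paracetamolvitamine-c", 99.762596),
]

# Top-level scores from the <lst name="explain"> block.
explain_scores = {
    "http://www.apotheek.nl/medicijnen/paracetamol?product=paracetamol": 101.406906,
    "http://www.apotheek.nl/medicijnen/paracetamol?product=roter-paracetamol": 99.26059,
    "http://www.apotheek.nl/medicijnen/paracetamol?product=sinaspril-paracetamol": 99.26059,
    "http://www.apotheek.nl/medicijnen/paracetamol": 100.7385,
    "http://www.apotheek.nl/producten/paracetamolvitamine-c": 100.57981,
}


def compare(results, explains, tolerance=1e-3):
    """Return (mismatches, result_order, explain_order) keyed by doc_key."""
    explain_by_key = {doc_key(url): score for url, score in explains.items()}
    mismatches = []
    for url, score in results:
        key = doc_key(url)
        explained = explain_by_key[key]
        if abs(score - explained) > tolerance:
            mismatches.append((key, score, explained))
    result_order = [doc_key(url) for url, _ in results]
    # Stable sort: documents with equal explain scores keep their result order.
    explain_order = sorted(result_order,
                           key=lambda k: explain_by_key[k], reverse=True)
    return mismatches, result_order, explain_order


mismatches, result_order, explain_order = compare(result_scores, explain_scores)
for key, score, explained in mismatches:
    print(f"{key}: result={score} explain={explained}")
print("same order:", result_order == explain_order)
```

With the values quoted in this issue, all five documents show a score difference larger than the tolerance, and sorting by explain score yields a different order than the result list.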



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

