lucene-solr-user mailing list archives

From Michael Carlson <mich...@cycloneinteractive.com>
Subject Re: Strange Scoring Results
Date Thu, 17 Jul 2014 15:47:13 GMT
Okay, not sure if the following will help with troubleshooting, but here are a couple of links
that show visual representations of how the scores for these results are calculated. For these
queries I got rid of my boosts to make the results easier to read.

Here are the top 10 scoring results:
http://explain.solr.pl/explains/6mfdyixa

Here is the page which I think should score higher than all of them:
http://explain.solr.pl/explains/6xo77shs

Why does the second link show 0 for all of the matches despite the existence of matches?
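
For anyone who wants to reproduce explanations like the ones linked above, here is a minimal sketch of building a select request that asks Solr for its score explanations. It assumes the linked pages were generated from Solr's debug output. The host, port, and core name (`collection1`) are placeholders, not details from my setup; `debugQuery`, `fl`, `rows`, and `wt` are standard Solr request parameters.

```python
from urllib.parse import urlencode


def build_debug_url(base_url, query, rows=10):
    """Return a Solr select URL that also requests score explanations."""
    params = {
        "q": query,
        "rows": rows,
        "fl": "*,score",       # return all stored fields plus the score
        "wt": "json",          # response format
        "debugQuery": "true",  # include the per-document explain output
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical core URL; replace with the real one.
    print(build_debug_url(
        "http://localhost:8983/solr/collection1",
        "content:devlearn 2014 registration information",
    ))
```

The explain section of the response to that URL is the raw text that tools such as explain.solr.pl visualize.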



On Jul 16, 2014, at 11:36 AM, Michael Carlson <michael@cycloneinteractive.com> wrote:

> Hey All - 
> 
> I’m a Solr newbie in need of some help.
> 
> I’m using Apache Nutch to crawl a site and populate a Solr core, which we then
> query for search results. I’ve got it all up and running, but the scores Solr
> returns don’t seem to make any sense. Let’s take the following query as an example:
> 
> content:devlearn 2014 registration information
> 
> I have a page with a title of "DevLearn 2014 Conference & Expo - Registration Information"
> and a url of "www.mydomain.com/DevLearn/content/3426/devlearn-2014-conference--expo--registration-information/"
> which has multiple instances of all terms in the content field. I would expect this document
> to be returned at the top of the list, since in addition to being in the content field, all
> terms are in both the title and the url, which I’m boosting. Instead, it is returned as
> number 3320 in the results with a score of 0. Meanwhile, 3319 other pages are returned with
> higher scores, and all of these have fewer instances of the terms in the content field and
> at most one of the terms in the title or url.
> 
> Below is the select requestHandler section from my solrconfig.xml which shows the query
> select defaults. Let me know if I should include more of this file or any other information:
> 
> <requestHandler name="/select" class="solr.SearchHandler">
> 
>     <lst name="defaults">
>       <str name="echoParams">explicit</str>
>       <int name="rows">10</int>
>       <str name="df">text</str>
> 	   
>       <str name="hl">on</str>
>       <str name="hl.fl">content</str>
>       <str name="hl.encoder">html</str>
>       <str name="hl.simple.pre">&lt;strong&gt;</str>
>       <str name="hl.simple.post">&lt;/strong&gt;</str>
>       <str name="f.content.hl.snippets">1</str>
>       <str name="f.content.hl.fragsize">200</str>
>       <str name="f.content.hl.alternateField">content</str>
>       <str name="f.content.hl.maxAlternateFieldLength">750</str>
> 
>       <str name="defType">edismax</str>
>       <str name="qf">
>          content^0.5 url^10.0 title^10.0
>       </str>
>       <str name="df">content</str>
>       <str name="mm">100%</str>
>       <str name="q.alt">*:*</str>
>       <str name="rows">10</str>
>       <str name="fl">*,score</str>
>       <str name="pf">
>           content^0.5 url^10.0 title^10.0
>       </str>
>       <str name="ps">100</str>
> 
>     </lst>
> </requestHandler>
> 
> 
> 
> 
> 

