lucene-solr-user mailing list archives

From Michael Carlson <mich...@cycloneinteractive.com>
Subject Strange Scoring Results
Date Wed, 16 Jul 2014 15:36:38 GMT
Hey All - 

I’m a Solr newbie in need of some help.

I’m using Apache Nutch to crawl a site and populate a Solr core, which we then query for
search results. I’ve got it all up and running, but the scores Solr returns don’t seem to
make any sense. Take the following query as an example:

content:devlearn 2014 registration information

I have a page with a title of "DevLearn 2014 Conference & Expo - Registration Information"
and a url of "www.mydomain.com/DevLearn/content/3426/devlearn-2014-conference--expo--registration-information/",
which has multiple instances of all terms in the content field. I would expect this document
to be returned at the top of the list, since in addition to being in the content field, all
terms are in both the title and the url, both of which I’m boosting. Instead, it comes back as
number 3320 in the results with a score of 0. Meanwhile, 3319 other pages come back with higher
scores, even though all of them have fewer instances of the terms in the content field and at
most one of the terms in the title or url.

Below is the /select requestHandler section from my solrconfig.xml, which shows the query
defaults. Let me know if I should include more of this file or any other information:

<requestHandler name="/select" class="solr.SearchHandler">
  
     <lst name="defaults">
       <str name="echoParams">explicit</str>
       <int name="rows">10</int>
       <str name="df">text</str>
	   
       <str name="hl">on</str>
       <str name="hl.fl">content</str>
       <str name="hl.encoder">html</str>
       <str name="hl.simple.pre">&lt;strong&gt;</str>
       <str name="hl.simple.post">&lt;/strong&gt;</str>
       <str name="f.content.hl.snippets">1</str>
       <str name="f.content.hl.fragsize">200</str>
       <str name="f.content.hl.alternateField">content</str>
       <str name="f.content.hl.maxAlternateFieldLength">750</str>

       <str name="defType">edismax</str>
       <str name="qf">
          content^0.5 url^10.0 title^10.0
       </str>
       <str name="df">content</str>
       <str name="mm">100%</str>
       <str name="q.alt">*:*</str>
       <str name="rows">10</str>
       <str name="fl">*,score</str>
       <str name="pf">
           content^0.5 url^10.0 title^10.0
       </str>
       <str name="ps">100</str>

     </lst>
</requestHandler>
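
For anyone wanting to reproduce the ranking, here is a minimal sketch of how the score breakdown for the example query could be requested. It only builds the request URL. The host, port, and core name (localhost:8983, collection1) and the helper name build_debug_url are placeholders and not taken from my setup. debugQuery=true is the standard Solr parameter that adds the parsed query and a per-document score explanation to the response.

```python
from urllib.parse import urlencode

# Assumptions (not from the original message): Solr runs on localhost:8983
# and the core is named "collection1". Adjust both to the real setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/collection1/select"


def build_debug_url(query, rows=10):
    """Return a /select URL that asks Solr to explain how each hit was scored."""
    params = {
        "q": query,
        "rows": rows,
        "fl": "title,url,score",  # keep the response small
        "wt": "json",
        "debugQuery": "true",  # adds the parsed query and score explanations
    }
    return SOLR_SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_debug_url("content:devlearn 2014 registration information"))
```

The parsed query in the debug section of the response should show which fields each term was actually matched against, and the explain entries should show how the boosts contributed to each document's score.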





