lucene-solr-user mailing list archives

From Jodi Showers <j...@homestars.com>
Subject Document Boosts don't seem to be having an effect
Date Fri, 22 May 2009 21:35:39 GMT
Greetings - first post here - hoping someone can direct me - grasping  
at straws. thank you in advance. Jodi


I'm trying to tune the sort order using a combination of document and
query time boosts. When searching for the term 'builder', with almost
identical quantities of this term in both documents and a much larger
document boost for doc #1, it seems to me the score should be much
higher for doc #1.

Doc 1 boost - 21.5423634094682
Doc 1 scoring - 6.7017727
Doc 2 boost - 12.6390725007673
Doc 2 scoring - 8.00193

All fields being searched on are _t fields - all are:

<dynamicField name="*_t" type="text" indexed="true" stored="false"/>

where text is defined as:

     <fieldType name="text" class="solr.TextField"  
positionIncrementGap="100">
       <analyzer type="index">
         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
         <filter class="solr.StopFilterFactory" ignoreCase="true"  
words="stopwords.txt"/>
         <filter class="solr.WordDelimiterFilterFactory"  
generateWordParts="1" generateNumberParts="1" catenateWords="1"  
catenateNumbers="1" catenateAll="0"/>
         <filter class="solr.LowerCaseFilterFactory"/>
         <filter class="solr.EnglishPorterFilterFactory"  
protected="protwords.txt"/>
         <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
       </analyzer>
       <analyzer type="query">
         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
         <filter class="solr.SynonymFilterFactory"  
synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
         <filter class="solr.StopFilterFactory" ignoreCase="true"  
words="stopwords.txt"/>
         <filter class="solr.WordDelimiterFilterFactory"  
generateWordParts="1" generateNumberParts="1" catenateWords="0"  
catenateNumbers="0" catenateAll="0"/>
         <filter class="solr.LowerCaseFilterFactory"/>
         <filter class="solr.EnglishPorterFilterFactory"  
protected="protwords.txt"/>
         <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
       </analyzer>
     </fieldType>

omitNorms isn't specified - I've tried adding it to the text field
type definition - but it made no difference.


To illustrate I have the following documents (I may be overly verbose):

#1

<doc boost='21.5423634094682'><field name='type_t'>Company</ 
field><field name='pk_i'>211623</field><field name='id'>Company: 
211623</field><field name='lng'>-79.3761</field><field  
name='lat'>43.6496</field><field name='name_s' boost='0.0'>J. Roberts  
&amp;amp; Associates Interiors</field><field name='sort_name_t'  
boost='1.0'>J. Roberts &amp;amp; Associates Interiors</field><field  
name='profile_search_s' boost='1.0'>J. ROBERTS &amp;amp; ASSOCIATES  
INTERIORS
&amp;lt;br /&amp;gt;-30 years construction experience
&amp;lt;br /&amp;gt;-Quality service, on time, on budget
&amp;lt;br /&amp;gt;-All sub-trades are licensed and certified - We  
are fully licensed, insured and covered by WSIB.
&amp;lt;br /&amp;gt;-References available from our satisfied  
clients.&amp;lt;br/&amp;gt;

&amp;lt;br/&amp;gt;BUILDER
&amp;lt;br /&amp;gt;-Custom Homes , Additions and Major Renovations
&amp;lt;br /&amp;gt;-Project Management and Planning - Design /  
Build , Engineering , Permits
&amp;lt;br /&amp;gt;-Renovation Advisors for DIY homeowners
&amp;lt;br /&amp;gt;KITCHENS &amp;amp; INTERIORS
&amp;lt;br /&amp;gt;-Design and planning
&amp;lt;br /&amp;gt;-Custom kitchens and interior renovations
&amp;lt;br /&amp;gt;-Complete painting services
&amp;lt;br /&amp;gt;STRUCTURAL SERVICES
&amp;lt;br /&amp;gt;-Engineering , permits required, Foundations and  
underpinning
&amp;lt;br /&amp;gt;-Wall removal and beam installation&amp;lt;br/ 
&amp;gt;

&amp;lt;br/&amp;gt;Maintenance and Repairs Services
&amp;lt;br /&amp;gt;-Masonry repairs and Stone work  (in house staff )
&amp;lt;br /&amp;gt;-Windows and doors
&amp;lt;br /&amp;gt;-Eave troughs and metal work&amp;lt;br/&amp;gt;

&amp;lt;br/&amp;gt;</field><field name='profile_s' boost='0.0'>J.  
ROBERTS &amp;amp; ASSOCIATES INTERIORS
-30 years construction experience
-Quality service, on time, on budget
-All sub-trades are licensed and certified - We are fully licensed,  
insured and covered by WSIB.
-References available from our satisfied clients.

BUILDER
-Custom Homes , Additions and Major Renovations
-Project Management and Planning - Design / Build , Engineering ,  
Permits
-Renovation Advisors for DIY homeowners
KITCHENS &amp;amp; INTERIORS
-Design and planning
-Custom kitchens and interior renovations
-Complete painting services
STRUCTURAL SERVICES
-Engineering , permits required, Foundations and underpinning
-Wall removal and beam installation

Maintenance and Repairs Services
-Masonry repairs and Stone work  (in house staff )
-Windows and doors
-Eave troughs and metal work

</field><field name='categories_info_s' boost='1.0'>Builders, Home  
Builders, Home Contractors, residential builders, residential  
contractors, home construction companies, design build companies,  
design build contractors, residential building  
contractor,Foundations, ,General Contractors, Residential General  
Contractor, Building Contractor, Additions, Remodeling Contractor,   
Renovation, Builder,Home Additions, General contractor, home  
improvement, building addition, home expansion, house  
expansion,Kitchen &amp;amp; Bathroom - Cabinets &amp;amp; Design,  
Kitchen Cabinet And Counter, Kitchen Cabinet Hardware, Bathroom  
Cabinet, Bathroom Wall Cabinet, Bathroom Sink Cabinet,Kitchen Planning  
&amp;amp; Renovation, Kitchen Planning And Design, Kitchen Cabinet  
Planning, Kitchen Design, Kitchen Remodeling,Masonry &amp;amp;  
Bricklaying, Masonry Supply, Masonry Contractor, Concrete Masonry,  
Stone Masonry, Brick Laying Technique, Brick Laying Pattern, building  
a fireplace, constructing a firplace, stone fence, stone wall, brick  
wall, masonry repair, brick repairs,Paint &amp;amp; Wallpaper  
Contractors, Paint Colors, Paint Store, Paint Brush, Paint Shop, Home  
Wallpaper, Home Decorating, Wallpaper, Paint colour advice, paint  
colour consultants, wallpapering,</field><field  
name='reviews_info_cache_t' boost='0.0'></field><field  
name='position_rf' boost='0.0'>12.1115</field><field  
name='first_letter_of_name_t' boost='0.0'>J</field><field  
name='country_t' boost='0.0'>CANADA</field><field name='avg_rating_f'  
boost='0.0'>9.8913</field><field name='number_of_photos_ri'  
boost='0.0'>66</field><field name='number_of_reviews_i'  
boost='0.0'>23</field><field name='number_of_reviews_ri'  
boost='0.0'>23</field><field name='listing_i' boost='0.0'>2</ 
field><field name='state_s' boost='0.0'>approved</field><field  
name='category_name_facet'>Builders</field><field  
name='category_name_facet'>Foundations</field><field  
name='category_name_facet'>General Contractors</field><field  
name='category_name_facet'>Home Additions</field><field  
name='category_name_facet'>Kitchen &amp; Bathroom - Cabinets &amp;  
Design</field><field name='category_name_facet'>Kitchen Planning &amp;  
Renovation</field><field name='category_name_facet'>Masonry &amp;  
Bricklaying</field><field name='category_name_facet'>Paint &amp;  
Wallpaper Contractors</field></doc>

document boost: 21.5423634094682

#2

<doc boost='12.6390725007673'><field name='type_t'>Company</ 
field><field name='pk_i'>202883</field><field name='id'>Company: 
202883</field><field name='lng'>-79.4759</field><field  
name='lat'>43.6663</field><field name='name_s' boost='0.0'>D-CAM  
CONSTRUCTION INC.</field><field name='sort_name_t' boost='1.0'>D-CAM  
CONSTRUCTION INC.</field><field name='profile_search_s'  
boost='1.0'>&amp;lt;p&amp;gt;D-CAM CONSTRUCTION is a Custom Home  
Builder, and Restoration Specialist that offers all aspects of  
construction from foundation to interior and exterior finishing,  
design, development &amp;amp;amp; architectural planning, site  
supervision, and project management, while providing great service,  
quality workmanship and value. All work is completed to Ontario  
Building Code requirements.&amp;lt;/p&amp;gt;</field><field  
name='profile_s' boost='0.0'>D-CAM CONSTRUCTION is a Custom Home  
Builder, and Restoration Specialist that offers all aspects of  
construction from foundation to interior and exterior finishing,  
design, development &amp;amp;amp; architectural planning, site  
supervision, and project management, while providing great service,  
quality workmanship and value. All work is completed to Ontario  
Building Code requirements.</field><field name='categories_info_s'  
boost='1.0'>Builders, Home Builders, Home Contractors, residential  
builders, residential contractors, home construction companies, design  
build companies, design build contractors, residential building  
contractor,</field><field name='reviews_info_cache_t' boost='0.0'></ 
field><field name='position_rf' boost='0.0'>11.2444</field><field  
name='first_letter_of_name_t' boost='0.0'>D</field><field  
name='country_t' boost='0.0'>CANADA</field><field name='avg_rating_f'  
boost='0.0'>9.95833</field><field name='number_of_photos_ri'  
boost='0.0'>8</field><field name='number_of_reviews_i' boost='0.0'>6</ 
field><field name='number_of_reviews_ri' boost='0.0'>6</field><field  
name='listing_i' boost='0.0'>1</field><field name='state_s'  
boost='0.0'>approved</field><field  
name='category_name_facet'>Builders</field></doc>

document boost: 12.6390725007673

The following query returns both documents, but the scores are very
different from what I would expect. Both documents feature the word
"builder" in about equal amounts, so the document boost should push
#1 higher.


http://localhost:8984/solr/select?lat=43.648565&radius=30.0&wt=ruby&rows=500&q=((sort_name_t:builders^5+OR+profile_search_t:builders^1+OR+categories_info_t:builders^10)+AND+state_s:"approved"++AND+country_t:"CANADA")+AND+type_t:Company&fl=pk_i,score,geo_distance&qt=geo&version=2.2&long=-79.385329&debugQuery=on
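
Because the request above is easy to mangle when it is pasted or wrapped, here is a minimal sketch of assembling the same URL programmatically. The parameter names and values are taken from the request above. The helper name `build_select_url` is hypothetical, and letting `urlencode` percent-encode the query string is an assumption about how the client would send it, not how the original request was produced.

```python
from urllib.parse import urlencode


def build_select_url(base="http://localhost:8984/solr/select"):
    """Assemble the geo-filtered debug query from the post (hypothetical helper)."""
    # Lucene query syntax: per-field query-time boosts are written as ^N.
    q = ('((sort_name_t:builders^5 OR profile_search_t:builders^1 '
         'OR categories_info_t:builders^10) '
         'AND state_s:"approved" AND country_t:"CANADA") '
         'AND type_t:Company')
    params = [
        ("lat", "43.648565"),
        ("long", "-79.385329"),
        ("radius", "30.0"),
        ("qt", "geo"),
        ("q", q),
        ("fl", "pk_i,score,geo_distance"),
        ("rows", "500"),
        ("wt", "ruby"),
        ("version", "2.2"),
        ("debugQuery", "on"),
    ]
    # urlencode percent-encodes reserved characters and turns spaces into '+'.
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_select_url())
```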

debugQuery returns the following info about those 2 documents:

#1

'Company:211623'=>'
6.7017727 = (MATCH) sum of:
   6.456047 = (MATCH) sum of:
     6.456047 = (MATCH) product of:
       9.684071 = (MATCH) sum of:
         2.8248532 = (MATCH) weight(profile_search_t:builder in  
2540575), product of:
           0.15211171 = queryWeight(profile_search_t:builder), product  
of:
             12.380608 = idf(docFreq=28, numDocs=1825490)
             0.012286288 = queryNorm
           18.570911 = (MATCH) fieldWeight(profile_search_t:builder in  
2540575), product of:
             1.0 = tf(termFreq(profile_search_t:builder)=1)
             12.380608 = idf(docFreq=28, numDocs=1825490)
             1.5 = fieldNorm(field=profile_search_t, doc=2540575)
         6.859217 = (MATCH) weight(categories_info_t:builder^10.0 in  
2540575), product of:
           0.4906972 = queryWeight(categories_info_t:builder^10.0),  
product of:
             10.0 = boost
             3.9938607 = idf(docFreq=127266, numDocs=1825490)
             0.012286288 = queryNorm
           13.978513 = (MATCH) fieldWeight(categories_info_t:builder  
in 2540575), product of:
             2.0 = tf(termFreq(categories_info_t:builder)=4)
             3.9938607 = idf(docFreq=127266, numDocs=1825490)
             1.75 = fieldNorm(field=categories_info_t, doc=2540575)
       0.6666667 = coord(2/3)
     0.0 = (MATCH) weight(state_s:approved in 2540575), product of:
       0.012335457 = queryWeight(state_s:approved), product of:
         1.004002 = idf(docFreq=2530433, numDocs=1825490)
         0.012286288 = queryNorm
       0.0 = (MATCH) fieldWeight(state_s:approved in 2540575), product  
of:
         1.0 = tf(termFreq(state_s:approved)=1)
         1.004002 = idf(docFreq=2530433, numDocs=1825490)
         0.0 = fieldNorm(field=state_s, doc=2540575)
     0.0 = (MATCH) weight(country_t:canada in 2540575), product of:
       0.027072202 = queryWeight(country_t:canada), product of:
         2.2034485 = idf(docFreq=762573, numDocs=1825490)
         0.012286288 = queryNorm
       0.0 = (MATCH) fieldWeight(country_t:canada in 2540575), product  
of:
         1.0 = tf(termFreq(country_t:canada)=1)
         2.2034485 = idf(docFreq=762573, numDocs=1825490)
         0.0 = fieldNorm(field=country_t, doc=2540575)
   0.24572554 = (MATCH) weight(type_t:compani in 2540575), product of:
     0.012286282 = queryWeight(type_t:compani), product of:
       0.9999996 = idf(docFreq=2540581, numDocs=1825490)
       0.012286288 = queryNorm
     19.999992 = (MATCH) fieldWeight(type_t:compani in 2540575),  
product of:
       1.0 = tf(termFreq(type_t:compani)=1)
       0.9999996 = idf(docFreq=2540581, numDocs=1825490)
       20.0 = fieldNorm(field=type_t, doc=2540575)

#2

'Company:202883'=>'
8.00193 = (MATCH) sum of:
   7.854495 = (MATCH) sum of:
     7.854495 = (MATCH) product of:
       11.781742 = (MATCH) sum of:
         3.295662 = (MATCH) weight(profile_search_t:builder in  
2540578), product of:
           0.15211171 = queryWeight(profile_search_t:builder), product  
of:
             12.380608 = idf(docFreq=28, numDocs=1825490)
             0.012286288 = queryNorm
           21.666063 = (MATCH) fieldWeight(profile_search_t:builder in  
2540578), product of:
             1.0 = tf(termFreq(profile_search_t:builder)=1)
             12.380608 = idf(docFreq=28, numDocs=1825490)
             1.75 = fieldNorm(field=profile_search_t, doc=2540578)
         8.48608 = (MATCH) weight(categories_info_t:builder^10.0 in  
2540578), product of:
           0.4906972 = queryWeight(categories_info_t:builder^10.0),  
product of:
             10.0 = boost
             3.9938607 = idf(docFreq=127266, numDocs=1825490)
             0.012286288 = queryNorm
           17.293924 = (MATCH) fieldWeight(categories_info_t:builder  
in 2540578), product of:
             1.7320508 = tf(termFreq(categories_info_t:builder)=3)
             3.9938607 = idf(docFreq=127266, numDocs=1825490)
             2.5 = fieldNorm(field=categories_info_t, doc=2540578)
       0.6666667 = coord(2/3)
     0.0 = (MATCH) weight(state_s:approved in 2540578), product of:
       0.012335457 = queryWeight(state_s:approved), product of:
         1.004002 = idf(docFreq=2530433, numDocs=1825490)
         0.012286288 = queryNorm
       0.0 = (MATCH) fieldWeight(state_s:approved in 2540578), product  
of:
         1.0 = tf(termFreq(state_s:approved)=1)
         1.004002 = idf(docFreq=2530433, numDocs=1825490)
         0.0 = fieldNorm(field=state_s, doc=2540578)
     0.0 = (MATCH) weight(country_t:canada in 2540578), product of:
       0.027072202 = queryWeight(country_t:canada), product of:
         2.2034485 = idf(docFreq=762573, numDocs=1825490)
         0.012286288 = queryNorm
       0.0 = (MATCH) fieldWeight(country_t:canada in 2540578), product  
of:
         1.0 = tf(termFreq(country_t:canada)=1)
         2.2034485 = idf(docFreq=762573, numDocs=1825490)
         0.0 = fieldNorm(field=country_t, doc=2540578)
   0.14743532 = (MATCH) weight(type_t:compani in 2540578), product of:
     0.012286282 = queryWeight(type_t:compani), product of:
       0.9999996 = idf(docFreq=2540581, numDocs=1825490)
       0.012286288 = queryNorm
     11.999995 = (MATCH) fieldWeight(type_t:compani in 2540578),  
product of:
       1.0 = tf(termFreq(type_t:compani)=1)
       0.9999996 = idf(docFreq=2540581, numDocs=1825490)
       12.0 = fieldNorm(field=type_t, doc=2540578)
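
As a sanity check on the two explain trees above, the following sketch recomputes both scores from the numbers they report. Every constant (idf, queryNorm, tf, fieldNorm, coord) is copied from the debugQuery output, not derived independently. The function names are hypothetical, and the formula is just the product/sum structure the explain output itself shows. The state_s and country_t clauses are left out because their fieldNorm is 0.0 in both documents, so they contribute nothing.

```python
QUERY_NORM = 0.012286288  # queryNorm reported in both explain trees


def term_score(idf, tf, field_norm, boost=1.0):
    """weight = queryWeight * fieldWeight, as laid out in the explain output."""
    query_weight = boost * idf * QUERY_NORM
    field_weight = tf * idf * field_norm
    return query_weight * field_weight


def doc_score(profile_norm, categories_tf, categories_norm, type_norm):
    """Recombine the clauses the same way the explain tree does."""
    profile = term_score(idf=12.380608, tf=1.0, field_norm=profile_norm)
    categories = term_score(idf=3.9938607, tf=categories_tf,
                            field_norm=categories_norm, boost=10.0)
    coord = 2.0 / 3.0  # 2 of the 3 OR'ed field clauses matched
    type_clause = term_score(idf=0.9999996, tf=1.0, field_norm=type_norm)
    return coord * (profile + categories) + type_clause


# Doc #1 (Company:211623): termFreq 4 gives tf = sqrt(4) = 2.0
doc1 = doc_score(profile_norm=1.5, categories_tf=2.0,
                 categories_norm=1.75, type_norm=20.0)

# Doc #2 (Company:202883): termFreq 3 gives tf = sqrt(3)
doc2 = doc_score(profile_norm=1.75, categories_tf=3 ** 0.5,
                 categories_norm=2.5, type_norm=12.0)

print(round(doc1, 4), round(doc2, 4))
```

The recomputed values agree with the reported 6.7017727 and 8.00193 to within rounding. Since idf, queryNorm, the query boost and coord are identical for both documents, the only per-document inputs are tf and fieldNorm, so any effect of the document boost has to arrive through the fieldNorm values.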


