lucene-solr-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: to prevent number-of-matching-terms in contributing score
Date Thu, 17 Nov 2011 01:36:33 GMT

:  1. "omitTermFreqAndPositions" is very straightforward but if I avoid
: positions I'll refuse to serve phrase queries. I had searched for this in

but do you really need phrase queries on your "cat" field?  i thought the 
point was to have simple matching on those terms?

:  2. Function query seemed nice (though strange because I never used it
: before) and I gave it a few hours but that too did not seem to solve my
: requirement. The "artificial" score we are generating is getting multiplied
: into rest of the score which includes score due to "cat" field as well. (I
: can not remove "cat" from "qf" as I have to search there). It is only that
: I don't want this field's score on the basis of matching "tf".

I don't think i realized you were using dismax ... if you just want a 
match on "cat" to help determine if the document is a match, but not have 
*any* impact on score, you could just set the qf boost to 0 (ie: 
qf=title^10 cat^0) but i'm not sure if that's really what you want.
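
To make the effect of a zero boost concrete, here is a rough sketch in plain Java. It is a toy model, not Solr code: the class name, method name and the raw per-field scores are all made up for illustration, and it ignores queryNorm and the other pieces of real Lucene scoring. It only assumes the usual dismax combination (best field score plus tie times the sum of the other field scores) and shows that with cat^0 the tf on "cat" can no longer change the result.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of dismax field-score combination; hypothetical, not Solr code. */
final class DismaxZeroBoostSketch {

    /**
     * Best boosted field score plus tie * (sum of the other boosted scores).
     * rawFieldScores holds only the fields the document actually matched on.
     */
    static double dismax(Map<String, Double> rawFieldScores,
                         Map<String, Double> qfBoosts,
                         double tie) {
        double max = 0.0;
        double sum = 0.0;
        for (Map.Entry<String, Double> e : rawFieldScores.entrySet()) {
            double boost = qfBoosts.getOrDefault(e.getKey(), 1.0);
            double s = e.getValue() * boost;   // a boost of 0 wipes out the field's score
            sum += s;
            max = Math.max(max, s);
        }
        return max + tie * (sum - max);
    }

    public static void main(String[] args) {
        Map<String, Double> boosts = new LinkedHashMap<>();
        boosts.put("title", 10.0);
        boosts.put("cat", 0.0);               // qf=title^10 cat^0

        // Same title score, different (made-up) tf-driven scores on "cat".
        Map<String, Double> docA = new LinkedHashMap<>();
        docA.put("title", 0.5);
        docA.put("cat", 0.2);
        Map<String, Double> docB = new LinkedHashMap<>();
        docB.put("title", 0.5);
        docB.put("cat", 0.35);

        System.out.println("docA: " + dismax(docA, boosts, 0.01));
        System.out.println("docB: " + dismax(docB, boosts, 0.01));
    }
}
```

In this model a document that matches only on "cat" still counts as a match, it just scores 0, which is why the zero boost on its own may not be what you want.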

: After spending some hours on function queries I finally reached on
: following query

Honestly: i'm not really following what you tried there because of the 
formatting applied by your email client ... it seemed to be making tons of 
hyperlinks out of pieces of the URL.

Looking at your query explanation however the problem seems to be that you 
are still using the relevancy score of the matches on the "cat" field, 
instead of *just* using the function boost...

: But debugging the query showed that the boost value ($cat_boost) is being
: multiplied into a value which is generated with the help of "cat" field
: thus resulting in different scores for 1 and 3 (similarly for 2 and 4).
: 
: 1.2942866 = (MATCH) boost(+(title:chair | cat:chair)~0.01
: (),map(query(cat:chair,def=-1.0),0.0,1000.0,1.0)), product of:

...my point before was to take "cat:chair" out of the "main" part of your 
query, and *only* put it in the boost function.  if you are using dismax, 
the "qf=cat^0" suggestion mentioned above *combined* with your boost 
function will probably get you what you want (i think)
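
A minimal sketch of the arithmetic, again as a plain Java toy and not Solr code. It assumes map(x,min,max,target) returns target when x falls in [min,max] and x otherwise, that query(q,def) gives the subquery score or def when the doc doesn't match, and that the boost is multiplied into the main score, as the explain output above shows. The class name, method names and the scores are hypothetical.

```java
/** Toy model of boost(main, map(query(cat:chair,def=-1),0,1000,1)); hypothetical, not Solr code. */
final class CatBoostFunctionSketch {

    /** map(x,min,max,target): target when min <= x <= max, otherwise x unchanged. */
    static double map(double x, double min, double max, double target) {
        return (x >= min && x <= max) ? target : x;
    }

    /** query(q,def): the subquery score if the doc matched (non-null), otherwise def. */
    static double queryOrDefault(Double catScore, double def) {
        return catScore == null ? def : catScore;
    }

    /** The main relevancy score multiplied by the function value. */
    static double boosted(double mainScore, Double catScore) {
        double f = map(queryOrDefault(catScore, -1.0), 0.0, 1000.0, 1.0);
        return mainScore * f;
    }

    public static void main(String[] args) {
        // Made-up numbers: both docs score 1.0 on title, but differ in tf on "cat".
        double tie = 0.01;
        double doc1Cat = 0.2;
        double doc3Cat = 0.6;

        // "cat" still scored in the main query: its tf leaks in through the tie breaker.
        System.out.println(boosted(1.0 + tie * doc1Cat, doc1Cat));
        System.out.println(boosted(1.0 + tie * doc3Cat, doc3Cat));

        // "cat" contributing nothing to the main query (cat^0): only the function matters.
        System.out.println(boosted(1.0, doc1Cat));
        System.out.println(boosted(1.0, doc3Cat));
    }
}
```

The function itself is the same for any doc that matches on "cat" (it is always 1.0), so whatever difference remains has to be coming from the main part of the query.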

: I was thinking there should be some hook or plugin (or anything) which
: could just change the score calculation formula *for a particular field*.
: There is a function in DefaultSimilarity class - *public float tf(float
: freq)* but that does not mention the field name. Is there a possibility to
: look into this direction?

on trunk, there is a distinct Similarity object per fieldtype, so you 
could certainly look at that -- but you are correct that in 3.x there is no 
way to override the tf() function on a per field basis.
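
To illustrate the idea only (this is not the trunk API, and every name in it is hypothetical), a per-field tf override boils down to something like the following plain Java sketch. The one thing taken from Lucene is that DefaultSimilarity's tf() is sqrt(freq); the "cat" override that flattens tf to 1 for any match is just an assumption about what you'd want.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.DoubleUnaryOperator;

/** Toy per-field tf dispatch; hypothetical names, not the Lucene Similarity API. */
final class PerFieldTfSketch {

    private final Map<String, DoubleUnaryOperator> overrides = new HashMap<>();

    // sqrt(freq) is what DefaultSimilarity.tf(float freq) computes.
    private final DoubleUnaryOperator defaultTf = Math::sqrt;

    /** Register a field-specific tf function. */
    void override(String field, DoubleUnaryOperator tf) {
        overrides.put(field, tf);
    }

    /** Use the field's own tf function if one is registered, else the default. */
    double tf(String field, double freq) {
        return overrides.getOrDefault(field, defaultTf).applyAsDouble(freq);
    }

    public static void main(String[] args) {
        PerFieldTfSketch sim = new PerFieldTfSketch();
        // Any number of matches on "cat" counts the same as one match.
        sim.override("cat", freq -> freq > 0 ? 1.0 : 0.0);

        System.out.println("cat, freq=1:   " + sim.tf("cat", 1));
        System.out.println("cat, freq=3:   " + sim.tf("cat", 3));
        System.out.println("title, freq=4: " + sim.tf("title", 4));
    }
}
```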


-Hoss
