lucene-java-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Is it possible to use a custom similarity class to cause extra terms in a field to lower score?
Date Thu, 17 May 2007 23:34:24 GMT

: Terminator 2
: Terminator 2: Judgment Day
:
: And I score them against the query +title:(Terminator 2)

: Would there be some method or combination of methods in Similarity
: that I could easily override to allow me to penalize the second item
: because it had "unused terms"?

that's what the DefaultSimilarity already does: it uses the (length)norm
information stored when the documents are indexed to know which one is a
better match (because it matches on a shorter field)

If you aren't seeing that behavior then perhaps you turned on omitNorms for
that field, or perhaps the single-byte encoding of the norm is making the
distinction between your various field lengths too small -- overriding the
lengthNorm function and reindexing might help.
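
To make the byte-encoding point concrete, here is a rough, self-contained
sketch (no Lucene dependency). It assumes the default length norm is
1/sqrt(numTerms) and re-implements, from memory, the lossy one-byte float
encoding (3 mantissa bits, 5 exponent bits) that Lucene applies to norms.
The class name LengthNormSketch, the method names, and the "steeper"
1/numTerms variant are hypothetical illustrations, not Lucene API. In a real
index the change would go into a DefaultSimilarity subclass that overrides
lengthNorm(String fieldName, int numTerms), set on the IndexWriter before
reindexing.

```java
public class LengthNormSketch {

    // Assumed default behavior: 1 / sqrt(number of terms in the field).
    static float defaultLengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // Hypothetical steeper penalty for extra terms: 1 / numTerms.
    static float steeperLengthNorm(int numTerms) {
        return 1.0f / numTerms;
    }

    // Lossy one-byte encoding modeled on Lucene's norm encoding
    // (3 mantissa bits, 5 exponent bits, zero exponent 15).
    static byte encodeNorm(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> (24 - 3);
        int zero = (63 - 15) << 3;
        if (small <= zero) {
            // Too small to represent: zero/negative -> 0, else smallest positive.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= zero + 0x100) {
            return (byte) -1; // overflow: clamp to the largest value
        }
        return (byte) (small - zero);
    }

    // Inverse of encodeNorm (precision lost in encoding is not recovered).
    static float decodeNorm(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        // Print the norm each field length ends up with after the byte round trip.
        for (int n = 1; n <= 8; n++) {
            float d = decodeNorm(encodeNorm(defaultLengthNorm(n)));
            float s = decodeNorm(encodeNorm(steeperLengthNorm(n)));
            System.out.println(n + " terms: default=" + d + " steeper=" + s);
        }
    }
}
```

Under these assumptions a 2-term title ("Terminator 2") still gets a higher
stored norm than a 4-term one ("Terminator 2: Judgment Day"), but 3-term and
4-term fields collapse to the same byte with the default formula, while the
steeper variant keeps them apart. A steeper norm also penalizes every long
field more, so it is worth checking the effect on other queries.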



-Hoss

