lucene-java-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Any way to ignore repeated terms in TF calculation?
Date Sat, 03 Jan 2009 05:08:08 GMT

: you can solve your problem at search time by passing a custom Similarity class

In particular, consider subclassing SweetSpotSimilarity ... instead of a
truly "flat" tf function, it makes it easy for you to define a
"sweetspot" so 2 instances of a word can score a lot higher than 1
instance, and 3 can score slightly higher than 2, but 3-1000 instances can
all score equivalently.
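
To make the shape of that curve concrete, here is a minimal standalone sketch in plain Java. It is not SweetSpotSimilarity's code and does not use any Lucene API: the class name PlateauTf and the score values 1.0, 1.8 and 2.0 are made up purely to illustrate "2 beats 1 by a lot, 3 beats 2 by a little, everything past 3 is flat".

```java
// Illustrative only: a hand-rolled plateau tf curve.
// This is NOT Lucene's SweetSpotSimilarity implementation.
final class PlateauTf {
    // Hypothetical scores indexed by term frequency (0, 1, 2, 3+).
    // The values are assumptions chosen to match the shape described above.
    private static final float[] SCORES = {0.0f, 1.0f, 1.8f, 2.0f};

    // Returns the tf score for a term that occurs freq times in a document.
    static float tf(int freq) {
        if (freq <= 0) {
            return 0.0f;
        }
        // Every frequency at or above the last table index shares one score.
        int idx = Math.min(freq, SCORES.length - 1);
        return SCORES[idx];
    }

    public static void main(String[] args) {
        for (int freq : new int[] {1, 2, 3, 10, 1000}) {
            System.out.println(freq + " -> " + tf(freq));
        }
    }
}
```

In a Similarity subclass, a curve like this would presumably go in the tf override; check the SweetSpotSimilarity Javadoc for the tuning parameters it actually exposes before rolling your own.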


-Hoss


