lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jochen Wersdörfer <jochen-luc...@wersdoerfer.de>
Subject term frequency normalization
Date Tue, 03 Feb 2009 16:27:16 GMT
Hi,

i'd like to use the term frequency normalization described in

http://wiki.apache.org/lucene-java/TREC%202007%20Million%20Queries%20Track%20-%20IBM%20Haifa%20Team

so that the term frequency tf becomes

tf(f, d) = log(1 + feq(t, d)) / log(1 + avgFreq(d))

The easiest way to change the tf calculation would be overwriting
tf in an own implementation of Similarity like it's done in
SweetSpotSimilarity. But the average term frequency of the
document is missing. Is there a simple way to get or calc this
number?

regards,
jochen

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message