I found this page extremely helpful for working out EXACTLY what Lucene is
doing (and how to change it, if I wanted to). Like Erik said, it does
pretty darn well just as it is. I'm not sure whether anyone has already
pointed you to this page.
You'll have to spend some time digging into each component of the
score, since it's composed of a number of different factors, but they
are all described on this page, along with the default behavior you get
from DefaultSimilarity.
http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/org/apache/lucene/search/Similarity.html
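
To give a feel for the individual factors, here is a rough sketch of the default formulas as I understand them from that page. This is plain Java with no Lucene dependency, not Lucene's own code: the class name DefaultScoreFactors and the termWeight helper are made up for illustration, boosts and the query norm are left out, and the exact defaults may differ between versions, so check the javadoc for the release you are running.

```java
// Illustrative only: mirrors the documented DefaultSimilarity defaults,
// but does not call into Lucene. Class and helper names are hypothetical.
public class DefaultScoreFactors {

    // Term frequency factor: square root of the term's frequency in the document.
    public static double tf(double freq) {
        return Math.sqrt(freq);
    }

    // Inverse document frequency: rarer terms get a larger weight.
    public static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    // Length normalization: matches in shorter fields count for more.
    public static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    // Coordination factor: fraction of the query terms found in the document.
    public static double coord(int overlap, int maxOverlap) {
        return overlap / (double) maxOverlap;
    }

    // Simplified single-term contribution (assumption: all boosts are 1
    // and the query norm is ignored): tf * idf^2 * lengthNorm.
    public static double termWeight(double freq, int docFreq, int numDocs, int numTerms) {
        double idf = idf(docFreq, numDocs);
        return tf(freq) * idf * idf * lengthNorm(numTerms);
    }

    public static void main(String[] args) {
        // A term occurring 4 times in a 100-term field, present in 9 of 1000 docs.
        System.out.println(termWeight(4, 9, 1000, 100));
    }
}
```

To change the behavior in Lucene itself, the usual route is to subclass DefaultSimilarity, override the factor you care about, and set that Similarity on your searcher (and on the writer too, if you change the length norm).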
Donna L. Gresh
Services Research, Mathematical Sciences Department
IBM T.J. Watson Research Center
(914) 945-2472
http://www.research.ibm.com/people/g/donnagresh
gresh@us.ibm.com