lucene-dev mailing list archives

From markharw00d <>
Subject Re: highlight - scoring fragments with more of the same token
Date Tue, 26 Sep 2006 22:29:29 GMT
>>I was somewhat surprised to find that highlighting scoring simply counts
>>how many unique query terms appear in the fragment. Guess was expecting a

See the QueryScorer(Query query, IndexReader reader, String fieldName) constructor - this
factors IDF into the weighting for terms. Query boosts are automatically factored in too.
TF is not a factor in fragment scores because I found it's typically more useful to look for
fragments containing a strong mix of the query terms - not merely repetitions of the same
term. The choice of scorer is pluggable, so you can swap it if you don't like the default
behaviour.
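
To make the "mix of terms, not repetitions" point concrete, here is a minimal standalone sketch of that scoring idea. It is not the QueryScorer source and uses no Lucene classes: the class name, the method, and the term weights (standing in for IDF times boost) are all made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: hypothetical names, not Lucene's QueryScorer.
final class FragmentScoreSketch {

    // Sums the weight of each distinct query term found in the fragment.
    // Repeated occurrences of a term add nothing, so there is no TF component.
    static double score(List<String> fragmentTokens, Map<String, Double> termWeights) {
        Set<String> seen = new HashSet<>();
        double total = 0.0;
        for (String token : fragmentTokens) {
            Double weight = termWeights.get(token);
            // seen.add returns false if the term was already counted
            if (weight != null && seen.add(token)) {
                total += weight;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed weights: a rarer term gets a higher value, as IDF would give it.
        Map<String, Double> weights = new HashMap<>();
        weights.put("lucene", 2.0);
        weights.put("search", 1.0);

        List<String> repeated = Arrays.asList("search", "search", "search");
        List<String> mixed = Arrays.asList("lucene", "full", "text", "search");

        System.out.println(score(repeated, weights)); // prints 1.0
        System.out.println(score(mixed, weights));    // prints 3.0
    }
}
```

The fragment that mentions both terms once outscores the one that repeats a single term three times, which is the behaviour described above.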

The possibility of adding smarter fragmenting is also enabled by the interface for Fragmenter
- no "smarter" alternatives to the simple one have been implemented as yet though (as far
as I am aware).
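
As a rough idea of what a "smarter" fragmenter might do, the sketch below breaks text at sentence-ending punctuation. It is a standalone illustration under my own assumptions: it does not implement the Fragmenter interface, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: a standalone stand-in, not Lucene's Fragmenter interface.
final class SentenceFragmenterSketch {

    // Splits text into fragments that end at '.', '!' or '?'.
    static List<String> fragments(String text) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            current.append(c);
            if (c == '.' || c == '!' || c == '?') {
                addIfNotBlank(result, current);
                current.setLength(0); // start collecting the next fragment
            }
        }
        addIfNotBlank(result, current); // trailing text without closing punctuation
        return result;
    }

    // Adds the trimmed buffer contents to the list unless they are empty.
    private static void addIfNotBlank(List<String> out, StringBuilder sb) {
        String s = sb.toString().trim();
        if (!s.isEmpty()) {
            out.add(s);
        }
    }
}
```

A real implementation would have to work against the token stream the highlighter supplies and cope with abbreviations and decimal points, which this sketch ignores.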



