lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Doron Cohen <DOR...@il.ibm.com>
Subject highlight - scoring fragments with more of the same token
Date Tue, 26 Sep 2006 01:52:02 GMT

This question was raised in the user's list -
http://www.nabble.com/highlighting-tf2322109.html

Assume three fragments and two queries:
  f1 = aa  11  bb  33  cc
  f2 = aa  11  bb  11  cc
  f3 = aa  11  bb  22  cc
  q1 = 11 22
  q2 = 11
Now we call highlighter.getBestFragment(q);
For q1, f3 is returned, as expected.
For q2, f1 is returned, although "11" appears twice in f2 but only once in
f1.

This is because QueryScorer.getTokenScore(Token) counts only unique
fragment tokens.

Would it make sense to make this behavior controllable?
(It is easily done but I am not sure about the consequences.)

Or perhaps there is a way to achieve this behavior (preferring f2 on f1 for
q2 above) that I missed?



---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Mime
View raw message