Thanks for pointing out this issue.
The bug was related to having a doc bigger than the maxNumDocsToAnalyze
setting. In this situation, the last fragment created was always sized
from maxNumDocsToAnalyze position to the remainder of the doc (in your
case, quite large!)
I have fixed this in SVN and Junit tests are clean. If you want to patch
your version comment out these lines in the Highlighter.java code
// append text after end of last token
if (lastEndOffset < text.length())
newText.append(encoder.encodeText(text.substring(lastEndOffset)));
I believe the above code was trying to retain any non-token text after
the last token eg appending ?! or . so I have removed it to be on the
safe side.
Cheers
Mark
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
|