lucene-java-user mailing list archives

From Fred Eaker <fredea...@gmail.com>
Subject Highlighter returning incomplete field text
Date Fri, 09 Feb 2007 16:28:36 GMT
Is there a limit to how many characters a Highlighter or NullFragmenter will
return?

I have indexed an entire HTML document (145 KB). When I use the Highlighter with
a NullFragmenter, the getBestFragment and getBestFragments methods return only
the first 51316 characters of the field text.

I have tried indexing other HTML documents as well, but I get the same result.

If I change the Highlighter's Encoder to DefaultEncoder, I get more characters,
but still not the entire field.

Here is some code:

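    // Highlight query terms, leaving the text unescaped (DefaultEncoder)
    Highlighter highlighter =
        new Highlighter(new SimpleHTMLFormatter(),
                        new DefaultEncoder(),
                        new QueryScorer(query));

    // NullFragmenter: treat the whole field as a single fragment
    highlighter.setTextFragmenter(new NullFragmenter());

    // Re-analyze the stored field text to get a token stream
    TokenStream tokenStream =
        LuceneUtils.getAnalyzer().tokenStream(
            fieldName,
            new StringReader(hit.get(fieldName)));

    String highlightedHit =
        highlighter.getBestFragment(tokenStream, hit.get(fieldName));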
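For measuring exactly where the returned text stops, a small check along the following lines may help. It is only a sketch and does not use Lucene at all: the class name HighlightCoverage and the method coveredLength are made up for illustration. It assumes the DefaultEncoder is in use (so the text is not escaped), that the highlight tags are the "<B>" and "</B>" that SimpleHTMLFormatter emits by default, and that the field text does not itself contain those tag strings.

```java
public class HighlightCoverage {

    /**
     * Hypothetical helper: returns how many leading characters of
     * `original` are reproduced in `highlighted`, skipping over the
     * highlight tags. Assumes `original` does not contain the tags.
     */
    static int coveredLength(String original, String highlighted,
                             String preTag, String postTag) {
        int i = 0; // position in the original field text
        int j = 0; // position in the highlighter output
        while (j < highlighted.length()) {
            if (!preTag.isEmpty() && highlighted.startsWith(preTag, j)) {
                j += preTag.length();   // skip an opening highlight tag
            } else if (!postTag.isEmpty() && highlighted.startsWith(postTag, j)) {
                j += postTag.length();  // skip a closing highlight tag
            } else if (i < original.length()
                    && original.charAt(i) == highlighted.charAt(j)) {
                i++;                    // same character in both strings
                j++;
            } else {
                break;                  // output diverges from the original
            }
        }
        return i;
    }

    public static void main(String[] args) {
        // Toy data standing in for the field text and the highlighter output
        String original = "alpha beta gamma delta";
        String highlighted = "alpha <B>beta</B> gam";
        int covered = coveredLength(original, highlighted, "<B>", "</B>");
        System.out.println(covered + " of " + original.length()
                + " characters returned");
    }
}
```

With the snippet above, the call would be coveredLength(hit.get(fieldName), highlightedHit, "<B>", "</B>"), and comparing the result against hit.get(fieldName).length() shows how much of the field is missing.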

