lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Highlighter returning incomplete field text
Date Fri, 09 Feb 2007 17:41:30 GMT
Also, there's a default of 10,000 tokens per field at index time....

Erick

On 2/9/07, mark harwood <markharw00d@yahoo.co.uk> wrote:
>
> See Highlighter.setMaxDocBytesToAnalyze(int byteCount)
>
> It's default setting is limited in order to avoid excessive response
> times.
>
> Cheers
> Mark
>
>
> ----- Original Message ----
> From: Fred Eaker <fredeaker@gmail.com>
> To: java-user@lucene.apache.org
> Sent: Friday, 9 February, 2007 4:28:36 PM
> Subject: Highlighter returning incomplete field text
>
> Is there a limit to how many characters a Highlighter or NullFragmenter
> will
> return?
>
> I have indexed an entire HTML document (145kb). When I use the highlighter
> with
> a NullFragmenter, the getBestFragment and getBestFragments methods return
> the
> text of the field up to 51316 characters.
>
> I have tried indexing other HTML documents as well, but get the same
> results.
>
> If I change the Highlighter's Encoder to DefaultEncoder, I get more
> characters,
> but not the entire field.
>
> Here is some code:
>
> Highlighter highlighter =
> new Highlighter(new SimpleHTMLFormatter(),
> new DefaultEncoder(),
> new QueryScorer(query));
>
> highlighter.setTextFragmenter(new NullFragmenter());
>
> TokenStream tokenStream =
> LuceneUtils.getAnalyzer().tokenStream(
> fieldName,
> new StringReader(hit.get(fieldName)));
>
> String highlightedHit =
> highlighter.getBestFragment(tokenStream, hit.get(fieldName));
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
>
>
>
>
>
> ___________________________________________________________
> The all-new Yahoo! Mail goes wherever you go - free your email address
> from your Internet provider. http://uk.docs.yahoo.com/nowyoucan.html
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message