lucene-solr-user mailing list archives

From Koji Sekiguchi
Subject Re: Snippets and Boundaryscanner in Highlighter
Date Fri, 23 Sep 2011 01:02:54 GMT
(11/09/23 8:57), O. Klein wrote:
> The content_text field is filled with text from pdf's. So this is not the
> problem. Besides the regex fragmenter gives back multiple snippets like
> expected.

That doesn't show that BoundaryScanner has a bug. Highlighter's fragmenter
and FVH's FragmentsBuilder are entirely separate code, so the regex fragmenter
returning multiple snippets says nothing about how FVH behaves.

> Have you tested to see if a boundaryscanner of type LINE gives back multiple
> snippets with your content?

No, I haven't. Do you mean that the LINE type causes the problem? Do you get
two snippets if you use a WORD type BreakIteratorBoundaryScanner?
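
For comparing the types outside Solr, here is a minimal sketch that prints
the boundaries reported by the JDK's java.text.BreakIterator line and word
instances. It assumes the LINE and WORD types map onto these JDK iterators;
the class name and the sample text are made up for illustration. Note that a
line instance reports places where text may be wrapped, which is usually
after each run of whitespace and not only at newline characters.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class BreakIteratorDemo {

    // Collect every boundary offset the given iterator reports for the text.
    static List<Integer> boundaries(BreakIterator it, String text) {
        it.setText(text);
        List<Integer> out = new ArrayList<>();
        for (int b = it.first(); b != BreakIterator.DONE; b = it.next()) {
            out.add(b);
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical sample; replace with text from your own field.
        String text = "first line\nsecond line";

        System.out.println("LINE: "
                + boundaries(BreakIterator.getLineInstance(Locale.ROOT), text));
        System.out.println("WORD: "
                + boundaries(BreakIterator.getWordInstance(Locale.ROOT), text));
    }
}
```

Running this on a sample of your PDF text should show whether the LINE
boundaries fall where you expect them to.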

If you think the LINE BreakIterator doesn't work as you expect, you can
implement your own BoundaryScanner instead.
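
As a rough sketch of what such a scanner could look like, the class below
moves a fragment's start and end to the nearest newline character. To keep it
compilable without Lucene on the classpath, it declares a local interface
that mirrors the two methods of Lucene's BoundaryScanner (findStartOffset and
findEndOffset); in a real setup you would implement the Lucene interface
instead. The names NewlineBoundaryScanner and maxScan, and the sample text,
are hypothetical.

```java
public class NewlineBoundaryScannerSketch {

    // Local stand-in (assumption) mirroring the shape of Lucene's
    // org.apache.lucene.search.vectorhighlight.BoundaryScanner.
    public interface BoundaryScanner {
        int findStartOffset(StringBuilder buffer, int start);
        int findEndOffset(StringBuilder buffer, int start);
    }

    // Hypothetical scanner that treats '\n' as the only boundary character.
    public static final class NewlineBoundaryScanner implements BoundaryScanner {
        private final int maxScan; // maximum number of chars to scan each way

        public NewlineBoundaryScanner(int maxScan) {
            this.maxScan = maxScan;
        }

        @Override
        public int findStartOffset(StringBuilder buffer, int start) {
            if (start < 1 || start > buffer.length()) return start;
            int offset = start;
            // Walk backwards until the previous char is a newline.
            for (int scanned = 0; offset > 0 && scanned < maxScan; scanned++, offset--) {
                if (buffer.charAt(offset - 1) == '\n') return offset;
            }
            // Reaching the start of the text counts as a boundary;
            // otherwise give up and keep the original offset.
            return offset == 0 ? 0 : start;
        }

        @Override
        public int findEndOffset(StringBuilder buffer, int start) {
            if (start < 0 || start > buffer.length()) return start;
            int offset = start;
            // Walk forwards until the current char is a newline.
            for (int scanned = 0; offset < buffer.length() && scanned < maxScan; scanned++, offset++) {
                if (buffer.charAt(offset) == '\n') return offset;
            }
            // Reaching the end of the text counts as a boundary too.
            return offset == buffer.length() ? offset : start;
        }
    }

    public static void main(String[] args) {
        StringBuilder text = new StringBuilder("alpha\nbeta gamma\ndelta");
        BoundaryScanner scanner = new NewlineBoundaryScanner(50);
        int hit = text.indexOf("gamma"); // pretend this is the match position
        int from = scanner.findStartOffset(text, hit);
        int to = scanner.findEndOffset(text, hit);
        System.out.println(text.substring(from, to));
    }
}
```

How a custom scanner gets registered with Solr depends on your version and
solrconfig.xml, so check the highlighting documentation for your release.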

Check out "Query Log Visualizer" for Apache Solr
