lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From markharw00d <markharw...@yahoo.co.uk>
Subject Re: Lucene "cuts" the search results ?
Date Tue, 15 Feb 2005 15:27:53 GMT
Hi Pierre,
Here's the response I gave the last time this question was raised::

The highlighter uses a number of "pluggable" services, one of which is the
choice of "Fragmenter" implementation. This interface is for classes which
decide the boundaries where to cut the original text into snippets. The 
default
implementation used simply breaks up text into evenly sized chunks. A more
intelligent implementation could be made to detect sentence boundaries.
What you are asking for requires that the Fragmenter would know where the
upcoming query matches are and decides on fragment boundaries with this in
mind. To have this foresight would require a preliminary pass over the
TokenStream to identify the match points before calling the highlighter.

This Fragmenter implementation does not exist but it does not sound
unachievable. I would suggest that some knowledge of sentence boundaries
probably would probably help here too. I dont have any plans to write such a
Fragmenter now but this is how it could be done.

Hope this helps,
Cheers,
Mark



---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message