lucene-java-user mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Question on the Sandbox Highlighter
Date Tue, 05 Jul 2005 21:23:20 GMT

On Jul 5, 2005, at 4:58 PM, Terence Lai wrote:
> I am currently using Lucene 1.4.2 with the highlighter downloaded
> from Lucene In Action.
>
> The Highlighter class provides the following method to highlight  
> the terms specified in the Query:
>
> /**
>  * Highlights chosen terms in a text, extracting the most relevant section.
>  * The document text is analysed in chunks to record hit statistics
>  * across the document. After accumulating stats, the fragment with the
>  * highest score is returned.
>  *
>  * @param tokenStream a stream of tokens identified in the text parameter,
>  * including offset information. This is typically produced by an analyzer
>  * re-parsing a document's text. Some work may be done on retrieving
>  * TokenStreams more efficiently by adding support for storing original
>  * text position data in the Lucene index but this support is not
>  * currently available (as of Lucene 1.4 rc2).
>  * @param text text to highlight terms in
>  *
>  * @return highlighted text fragment or null if no terms found
>  */
> public final String getBestFragment(TokenStream tokenStream, String text)
>        throws IOException;
>
>
> According to the javadoc, this method returns only the most
> relevant section of the text. Is there any way or method to return
> the ENTIRE text with the terms highlighted?

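Yes - it depends on the Fragmenter, which decides where the Highlighter
splits the text into fragments.  On lucenebook.com, for example, when a
search result is a blog entry, the entire contents are highlighted
using a NullFragmenter, which never starts a new fragment: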

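package lia.web;

import org.apache.lucene.search.highlight.Fragmenter;
import org.apache.lucene.analysis.Token;

/**
 * A Fragmenter that never starts a new fragment, so the Highlighter
 * treats the entire text as a single fragment.
 */
public class NullFragmenter implements Fragmenter {
   // No per-document state to initialize.
   public void start(String s) {
   }

   // Never split: every token stays in the first (and only) fragment.
   public boolean isNewFragment(Token token) {
     return false;
   }
}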
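
To see why a fragmenter that always returns false yields the whole text, here is a small dependency-free sketch of the idea. It is a toy model, not Lucene's Highlighter code: the names FragmenterSketch, ToyFragmenter and highlight are hypothetical, tokens are split on single spaces instead of by an analyzer, and all fragments are returned rather than only the best-scoring one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

public class FragmenterSketch {

    /** Hypothetical stand-in for the Fragmenter contract: decides where a new fragment starts. */
    public interface ToyFragmenter {
        boolean isNewFragment(int tokenIndex);
    }

    /**
     * Splits the text on single spaces, wraps matching terms in <B> tags,
     * and groups the tokens into fragments wherever the fragmenter says so.
     */
    public static List<String> highlight(String text, Set<String> terms, ToyFragmenter fragmenter) {
        List<String> fragments = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        String[] tokens = text.split(" ");
        for (int i = 0; i < tokens.length; i++) {
            // Close the current fragment when the fragmenter asks for a new one.
            if (i > 0 && fragmenter.isNewFragment(i)) {
                fragments.add(current.toString());
                current = new StringBuilder();
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            String token = tokens[i];
            boolean hit = terms.contains(token.toLowerCase());
            current.append(hit ? "<B>" + token + "</B>" : token);
        }
        fragments.add(current.toString());
        return fragments;
    }

    public static void main(String[] args) {
        String text = "Lucene is a search library and Lucene is fast";
        Set<String> terms = Collections.singleton("lucene");

        // A fragmenter that splits every three tokens produces several fragments.
        System.out.println(highlight(text, terms, i -> i % 3 == 0));

        // A "null" fragmenter never splits, so the single fragment is the entire text.
        System.out.println(highlight(text, terms, i -> false));
    }
}
```

With the real Highlighter, the fragmenter is supplied through setTextFragmenter before calling getBestFragment; since the text is then a single fragment, the "best" fragment is the whole highlighted text.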


     Erik



