lucene-dev mailing list archives

From markharw00d <markharw...@yahoo.co.uk>
Subject Re: Highlighter API
Date Fri, 18 Feb 2005 23:35:01 GMT
>>the Highlighter's getBestFragment method takes a TokenStream and a text. 
>>Wouldn't it be easier to give it just the text and an analyzer

That's how it was originally coded. The move to TokenStream was a deliberate
choice, made to decouple the highlighter from the source of tokens and to
allow alternatives. Re-analyzing document text with an Analyzer is one
(potentially costly) way of getting Tokens. Another is to use the new
TermVector support (see TokenSources.java in the highlighter package). In my
apps I have query processing stages that use TokenStreams to extract themes
from result sets, and the output of the TokenStreams produced in that stage
can usefully be cached and reused in the highlighting stage.

If ease of use is your concern, I would suggest wrapping the highlighter
functionality with a simpler (Analyzer-based) interface rather than changing
the internals of the highlighter implementation. That way more experienced
users still have the option to use optimized alternatives in the underlying
code.
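
To make the wrapping idea concrete, here is a minimal sketch of the layering.
It deliberately uses small stand-in types (Token, Analyzer, Highlighter,
SimpleHighlighter) that are assumptions for illustration only, not the Lucene
classes of the same names, so that it compiles with nothing but the JDK. In
Lucene itself the corresponding pieces would be the getBestFragment(TokenStream,
String) and analyzer.tokenStream(...) calls quoted below.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class HighlighterFacadeSketch {

    /** Stand-in token (hypothetical): term text plus offsets into the original text. */
    static final class Token {
        final String term;
        final int start;
        final int end;

        Token(String term, int start, int end) {
            this.term = term;
            this.start = start;
            this.end = end;
        }
    }

    /** Stand-in analyzer (hypothetical): one possible source of tokens. */
    interface Analyzer {
        List<Token> tokenize(String text);
    }

    /** Core highlighter: consumes tokens and does not care where they came from. */
    static final class Highlighter {
        private final Set<String> queryTerms;

        Highlighter(Set<String> queryTerms) {
            this.queryTerms = queryTerms;
        }

        String highlight(List<Token> tokens, String text) {
            StringBuilder out = new StringBuilder();
            int pos = 0;
            for (Token t : tokens) {
                if (queryTerms.contains(t.term)) {
                    // Copy the text before the match, then the match wrapped in <B> tags.
                    out.append(text, pos, t.start)
                       .append("<B>").append(text, t.start, t.end).append("</B>");
                    pos = t.end;
                }
            }
            out.append(text, pos, text.length());
            return out.toString();
        }
    }

    /** Convenience wrapper: hides token creation behind an Analyzer-based call. */
    static final class SimpleHighlighter {
        private final Highlighter core;
        private final Analyzer analyzer;

        SimpleHighlighter(Highlighter core, Analyzer analyzer) {
            this.core = core;
            this.analyzer = analyzer;
        }

        String highlight(String text) {
            return core.highlight(analyzer.tokenize(text), text);
        }
    }

    /** Toy analysis: split on whitespace and lowercase each term. */
    static List<Token> lowercaseWhitespaceTokens(String text) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) i++;
            int start = i;
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) i++;
            if (i > start) {
                tokens.add(new Token(text.substring(start, i).toLowerCase(Locale.ROOT), start, i));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "Lucene makes search easy";
        Highlighter core = new Highlighter(new HashSet<>(Arrays.asList("search")));

        // Easy path: the wrapper re-analyzes the text on every call.
        SimpleHighlighter simple =
            new SimpleHighlighter(core, HighlighterFacadeSketch::lowercaseWhitespaceTokens);
        System.out.println(simple.highlight(text));

        // Expert path: tokens computed once (e.g. in an earlier stage) and reused.
        List<Token> cached = lowercaseWhitespaceTokens(text);
        System.out.println(core.highlight(cached, text));
    }
}
```

Both calls print the same highlighted text. The difference is only in who
produces the tokens: the wrapper always pays for analysis, while a caller of
the core method can hand in tokens obtained some cheaper way.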

Cheers,
Mark



Daniel Naber wrote:

>Hi,
>
>the Highlighter's getBestFragment method takes a TokenStream and a text. 
>Wouldn't it be easier to give it just the text and an analyzer so the user 
>doesn't have to care about building a TokenStream? Like this:
>
>public final String getBestFragment(Analyzer analyzer, String text)
>throws IOException
>{
>  TokenStream tokenStream = analyzer.tokenStream("field",
>    new StringReader(text));
>  return getBestFragment(tokenStream, text);
>}
>
>The old method could then be deprecated. Or am I missing something? This 
>would also avoid problems in case the stream doesn't match the text.
>
>Regards
> Daniel
>
>  
>


