lucene-java-user mailing list archives

From "Dan Climan" <dcli...@keepmedia.com>
Subject Highlighter, Term Positions and Stopwords
Date Tue, 06 Dec 2005 04:32:12 GMT
Do stopfilters create non-contiguous token positions?
 
I was interested in experimenting with the highlighter and using the
TokenSources.getTokenStream(TermPositionVector tpv, boolean
tokenPositionsGuaranteedContiguous) method.
 
The javadocs for this method note that:

tokenPositionsGuaranteedContiguous - true if the token position numbers have
no overlaps or gaps.

 

The example used for comparison to re-analyzing the text includes
stopwords ("timings above were using a stemmer/lowercaser/stopword combo").

I was curious whether using stopwords, by definition, means that token
positions are not contiguous. Is this still true if the query uses the same
analyzer and filters out the same stopwords?
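
To make the question concrete, here is a minimal sketch in plain Java. It
does not use Lucene's StopFilter or any Lucene class; the class name, the
stopword set, and the preserveGaps flag are all hypothetical and only
illustrate the two ways positions could be assigned once stopwords are
removed. Which of the two the real filter does is exactly what I am asking.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StopwordPositions {
    // Hypothetical stopword set, for illustration only.
    static final Set<String> STOPWORDS =
            new HashSet<>(Arrays.asList("the", "a", "of"));

    /**
     * Returns the position assigned to each token that survives stopword removal.
     * If preserveGaps is true, a removed stopword still consumes a position,
     * so the surviving positions can have gaps. If false, survivors are simply
     * numbered 0, 1, 2, ... with no gaps.
     */
    static List<Integer> positions(String[] tokens, boolean preserveGaps) {
        List<Integer> result = new ArrayList<>();
        int position = 0;
        for (String token : tokens) {
            if (STOPWORDS.contains(token)) {
                if (preserveGaps) {
                    position++; // the dropped token leaves a hole behind
                }
                continue;
            }
            result.add(position);
            position++;
        }
        return result;
    }

    /** True if every position is exactly one more than the previous one. */
    static boolean isContiguous(List<Integer> positions) {
        for (int i = 1; i < positions.size(); i++) {
            if (positions.get(i) != positions.get(i - 1) + 1) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] tokens = {"the", "quick", "fox", "of", "doom"};
        System.out.println(positions(tokens, false)); // [0, 1, 2]
        System.out.println(positions(tokens, true));  // [1, 2, 4]
    }
}
```

In the first case the positions would be contiguous even though stopwords
were dropped; in the second they would not be, regardless of whether the
query goes through the same analyzer.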

 

Thanks,

Dan

