lucene-java-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: phrase search with custom TokenFilter
Date Wed, 19 Mar 2008 00:20:20 GMT

You're going to want to change your TokenFilter so that it emits the split 
pieces as tokens immediately after the original token, each with a 
positionIncrement of 0. Don't buffer them up and wait for the entire 
stream to finish first.

It's the true order of the tokens in the TokenStream and the 
positionIncrement values that matter when doing a PhraseQuery -- not the 
start/end offsets.
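
To make the ordering concrete, here is a rough sketch in plain Java. It 
deliberately does not use the Lucene API: Tok, analyze() and positions() are 
made-up names for illustration, and splitting on '-' is only an assumed 
stand-in for whatever your filter actually splits on. It just models the 
emit order and the increments described above.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the emit order; NOT Lucene's TokenFilter API.
final class SplitTokenSketch {

    // Hypothetical stand-in for a token: term text plus position increment.
    static final class Tok {
        final String term;
        final int positionIncrement;

        Tok(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    // Emit each original token with increment 1, then its split pieces
    // right behind it with increment 0 (no buffering until end of stream).
    // Assumption: pieces are produced by splitting on '-'.
    static List<Tok> analyze(List<String> words) {
        List<Tok> out = new ArrayList<>();
        for (String word : words) {
            out.add(new Tok(word, 1));
            String[] parts = word.split("-");
            if (parts.length > 1) {
                for (String part : parts) {
                    if (!part.isEmpty()) {
                        out.add(new Tok(part, 0));
                    }
                }
            }
        }
        return out;
    }

    // Absolute positions are the running sum of the increments,
    // starting so that the first token lands on position 0.
    static List<Integer> positions(List<Tok> toks) {
        List<Integer> result = new ArrayList<>();
        int pos = -1;
        for (Tok t : toks) {
            pos += t.positionIncrement;
            result.add(pos);
        }
        return result;
    }
}
```

For the input "fast", "wi-fi", "router" this yields the tokens fast, wi-fi, 
wi, fi, router at positions 0, 1, 1, 1, 2. The pieces share the original 
token's position, so "router" still sits exactly one position after them, 
which is what a phrase match depends on.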

Incidentally: you might want to take a look at Solr's WordDelimiterFilter, 
both as an example of how to do this, and because it may already meet all 
the needs you've anticipated and some you might not have thought of but 
might want to use once you take a look at them...

http://svn.apache.org/viewvc/lucene/solr/trunk/src/java/org/apache/solr/analysis/WordDelimiterFilter.java?view=markup
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#WordDelimiterFilter



-Hoss



