lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erick Erickson <erickerick...@gmail.com>
Subject Re: StandardFilter not handling dots as exptected ?
Date Thu, 06 Aug 2009 14:27:48 GMT
I don't see anything obvious in the code.
Are you using the same analzer at query time as at index time?
I'd also get a copy of Luke and examine your index to see what
is actually getting put in it, and query.toString might help.

Best
Erick

On Thu, Aug 6, 2009 at 10:03 AM, Paul Taylor <paul_t100@fastmail.fm> wrote:

>
> Hi want the query "R.E.S" to match "R.E.S"
>
> I use StandardFilter in my analyzer below and the description says:
>
>   'Splits words at punctuation characters, removing punctuation. However, a
> dot that's not followed by whitespace is considered part of a token. '
>
> so I thought that R.E.S. would become searchable as R.E.S, and the search
> would work, but it doesn't whereas searching for "R.E.S" does return a hit .
>
> thanks Paul
>
> public class StandardUnaccentAnalyzer extends Analyzer {
>
>   public TokenStream tokenStream(String fieldName, Reader reader) {
>       StandardTokenizer tokenStream = new StandardTokenizer(reader);
>       TokenStream result = new StandardFilter(tokenStream);
>       result = new LowerCaseFilter(result);
>       return result;
>   }
>     private static final class SavedStreams {
>       StandardTokenizer tokenStream;
>       TokenStream filteredTokenStream;
>   }
>     public TokenStream reusableTokenStream(String fieldName, Reader reader)
> throws IOException {
>       SavedStreams streams = (SavedStreams)getPreviousTokenStream();
>       if (streams == null) {
>           streams = new SavedStreams();
>           setPreviousTokenStream(streams);
>           streams.tokenStream = new StandardTokenizer(reader);
>           streams.filteredTokenStream = new
> StandardFilter(streams.tokenStream);
>           streams.filteredTokenStream = new
> LowerCaseFilter(streams.filteredTokenStream);
>       }
>       else {
>           streams.tokenStream.reset(reader);
>       }
>       return streams.filteredTokenStream;
>   }
>
> }
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message