lucene-java-user mailing list archives

From "Polina Litvak" <plit...@casebank.com>
Subject NullAnalyzer still tokenizes fields
Date Fri, 09 Jul 2004 14:53:47 GMT
I tried to create my own analyzer that returns fields as they are
(without any tokenizing), using code posted on lucene-user a short
while ago:
 
import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.CharTokenizer;
import org.apache.lucene.analysis.TokenStream;

private static class NullAnalyzer extends Analyzer
{
  // Intended to return the entire field value as a single token, untouched.
  public TokenStream tokenStream(String fieldName, Reader reader)
  {
    return new CharTokenizer(reader)
    {
      // Treat every character as a token character so nothing is split or dropped.
      protected boolean isTokenChar(char c)
      {
        return true;
      }
    };
  }
}
 
 
After testing this analyzer I found that the fields I pass through it still
get tokenized.
For example, I have a field with the value ABCD-EF. When I pass it through
the analyzer, the only characters that end up in isTokenChar() are A, B, C,
D, E and F, so it looks like the "-" gets filtered out before it ever
reaches isTokenChar().
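 
For reference, here is a rough sketch of the kind of standalone check that
shows the behaviour (the field name "part", the input string and the
println are just illustration, not the exact code I ran; it uses the
TokenStream.next()/Token.termText() API):
 
import java.io.StringReader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;

Analyzer analyzer = new NullAnalyzer();
TokenStream stream =
    analyzer.tokenStream("part", new StringReader("ABCD-EF"));
Token token;
// I expect the whole value ABCD-EF back as a single token.
while ((token = stream.next()) != null)
{
  System.out.println(token.termText());
}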
 
Has anyone encountered this problem?
Any help would be greatly appreciated!
 
Thanks,
Polina 
 
 
