lucene-java-user mailing list archives

From "mganesh" <mgan...@newgen.co.in>
Subject Analyzer
Date Fri, 11 Apr 2003 12:45:20 GMT
Hello folks,
 I have written my own Analyzer class, which should filter out the stop words.
 But the stop words are not removed from the *.fdt file.

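import java.io.Reader;
import java.util.Hashtable;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.PorterStemFilter;
import org.apache.lucene.analysis.StopFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;

public class MyAnalyzer extends Analyzer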
{
    public static final String[] STOP_WORDS =
    {
        "0", "1", "2", "3", "4", "5", "6", "7", "8",
        "9", "000", "$",
        "about", "after", "all", "also", "an", "and",
        "another", "any", "are", "as", "at", "be",
        "because", "been", "before", "being", "between",
        "both", "but", "by", "came", "can", "come",
        "could", "did", "do", "does", "each", "else",
        "for", "from", "get", "got", "has", "had",
        "he", "have", "her", "here", "him", "himself",
        "his", "how","if", "in", "into", "is", "it",
        "its", "just", "like", "make", "many", "me",
        "might", "more", "most", "much", "must", "my",
        "never", "now", "of", "on", "only", "or",
        "other", "our", "out", "over", "re", "said",
        "same", "see", "should", "since", "so", "some",
        "still", "such", "take", "than", "that", "the",
        "their", "them", "then", "there", "these",
        "they", "this", "those", "through", "to", "too",
        "under", "up", "use", "very", "want", "was",
        "way", "we", "well", "were", "what", "when",
        "where", "which", "while", "who", "will",
        "with", "would", "you", "your",
        "a", "b", "c", "d", "e", "f", "g", "h", "i",
        "j", "k", "l", "m", "n", "o", "p", "q", "r",
        "s", "t", "u", "v", "w", "x", "y", "z"
    };

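    // Per-instance stop table, so that a custom list passed to one analyzer
    // does not overwrite the table used by every other instance.
    private final Hashtable _stopTable;

    public MyAnalyzer()
    {
        // Analyzer has no constructor that takes a stop-word array,
        // so delegate to the constructor below instead of super(STOP_WORDS).
        this(STOP_WORDS);
    }

    public MyAnalyzer(String[] stopWords)
    {
        _stopTable = StopFilter.makeStopTable(stopWords);
    }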
    public final TokenStream tokenStream(Reader reader)
    {
        System.out.println("Stop Analyzer is called");
        //return new StopFilter(new LowerCaseTokenizer(reader),STOP_WORDS);

        // Tokenize, normalize, lower-case, drop stop words, then stem.
        TokenStream result = new StandardTokenizer(reader);
        result = new StandardFilter(result);
        result = new LowerCaseFilter(result);
        result = new StopFilter(result, _stopTable);
        result = new PorterStemFilter(result);
        return result;
    }
}

If there is a problem with this code, please help me.

regards
ganesh

