lucene-java-user mailing list archives

From Mike Barry <>
Subject Re: Implicit Stopping in StandardTokenizer??
Date Mon, 20 Jun 2005 14:53:21 GMT
Max Pfingsthorn wrote:

>I've been trying to make an Analyzer which works like the StandardAnalyzer but without
>stopping. For some reason though, I still don't get words like "is" or "a" out of it... I
>checked with Luke (one doc in one index with the contents "hello,this,is,a,keyword,hello!,nicetomeetyou").
>This should tokenize into "hello this is a keyword hello nicetomeetyou", but actually it does
>"hello keyword hello nicetomeetyou". Does anyone know why it drops those extra terms?
>
>Best regards,
>Max Pfingsthorn

StandardAnalyzer has a constructor which allows you to pass in your own
array of stop words, so an array with zero elements should do the trick:

String[] stopWords        = new String[0];
StandardAnalyzer analyzer = new StandardAnalyzer(stopWords);
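
To see why the stop list matters here, below is a rough stand-in for the filtering step, written in plain Java with no Lucene dependency. It is only a sketch and not Lucene's actual code: the class name, the `analyze` helper, the splitting rule, and the four-word sample stop list are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopFilterSketch {
    // Hypothetical sample stop list for illustration; not Lucene's default list.
    static final String[] SAMPLE_STOP_WORDS = {"a", "is", "this", "the"};

    // Splits on runs of non-alphanumeric characters, lowercases each piece,
    // and drops any piece found in the given stop word array.
    static List<String> analyze(String text, String[] stopWords) {
        Set<String> stopSet = new HashSet<>(Arrays.asList(stopWords));
        List<String> terms = new ArrayList<>();
        for (String raw : text.split("[^A-Za-z0-9]+")) {
            String term = raw.toLowerCase(Locale.ROOT);
            if (!term.isEmpty() && !stopSet.contains(term)) {
                terms.add(term);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        String text = "hello,this,is,a,keyword,hello!,nicetomeetyou";
        // With a non-empty stop list, "this", "is" and "a" are removed.
        System.out.println(analyze(text, SAMPLE_STOP_WORDS));
        // With a zero-length array, nothing is filtered out.
        System.out.println(analyze(text, new String[0]));
    }
}
```

With the sample stop list the terms "this", "is" and "a" disappear, which matches what you saw in Luke; with the zero-length array every term is kept.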

