lucene-java-user mailing list archives

From Mike Barry <MBa...@cos.com>
Subject Re: Implicit Stopping in StandardTokenizer??
Date Mon, 20 Jun 2005 14:53:21 GMT
Max Pfingsthorn wrote:

>Hi!
>
>I've been trying to make an Analyzer which works like the StandardAnalyzer but without
>stopping. For some reason though, I still don't get words like "is" or "a" out of it... I
>checked with Luke (one doc in one index with the contents "hello,this,is,a,keyword,hello!,nicetomeetyou").
>This should tokenize into "hello this is a keyword hello nicetomeetyou", but actually it does
>"hello keyword hello nicetomeetyou". Does anyone know why it drops those extra terms?
>
>Best regards,
>
>Max Pfingsthorn
>
>Hippo
>
StandardAnalyzer has a constructor that accepts your own array of
stop words, so passing an array with zero elements should do the
trick:


// An empty stop-word array means no terms are filtered out.
String[] stopWords = new String[0];
StandardAnalyzer analyzer = new StandardAnalyzer(stopWords);
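
To see why an empty stop list brings "this", "is" and "a" back, here is a minimal sketch of the idea in plain Java. It does not use Lucene at all: the class StopFilterSketch and its analyze method are made-up names for illustration, and the tokenizing rule (split on anything that is not a letter or digit, then lowercase) is only an assumption that roughly mimics what happens to the sample text, not StandardTokenizer's real grammar.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; this is not Lucene code.
class StopFilterSketch {

    // Splits on runs of non-alphanumeric characters, lowercases each token,
    // and drops every token that appears in stopWords.
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("[^A-Za-z0-9]+")) {
            if (raw.isEmpty()) {
                continue; // split() can yield a leading empty string
            }
            String term = raw.toLowerCase(Locale.ROOT);
            if (!stopWords.contains(term)) {
                tokens.add(term);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "hello,this,is,a,keyword,hello!,nicetomeetyou";

        // With a stop list containing the three short words, they vanish.
        Set<String> someStops = new HashSet<>(Arrays.asList("this", "is", "a"));
        System.out.println(analyze(text, someStops));

        // With an empty stop list, every token survives.
        System.out.println(analyze(text, Collections.<String>emptySet()));
    }
}
```

The first call prints [hello, keyword, hello, nicetomeetyou], which matches what Max saw in Luke; the second prints [hello, this, is, a, keyword, hello, nicetomeetyou], which is the result he expected.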

-MikeB.
