lucene-java-user mailing list archives

From Erik Hatcher
Subject Re: Implicit Stopping in StandardTokenizer??
Date Mon, 20 Jun 2005 14:56:49 GMT

On Jun 20, 2005, at 10:41 AM, Max Pfingsthorn wrote:

> Hi!
> I've been trying to make an Analyzer which works like the  
> StandardAnalyzer but without stopping. For some reason though, I  
> still don't get words like "is" or "a" out of it... I checked with  
> Luke (one doc in one index with the contents  
> "hello,this,is,a,keyword,hello!,nicetomeetyou"). This should  
> tokenize into "hello this is a keyword hello nicetomeetyou", but  
> actually it does "hello keyword hello nicetomeetyou". Does anyone  
> know why it drops those extra terms?

Show us the code of your analyzer.

If all you want is StandardAnalyzer to not remove stop words, you can  
construct it this way:

     analyzer = new StandardAnalyzer(new String[] {});

The String[] holds the stop words to remove; in this case, none.
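
To make the effect of the stop list concrete, here is a minimal sketch in plain Java. It is not Lucene code and not how StandardAnalyzer is implemented: the class StopFilterSketch, the method analyze, and the regex-based splitting are all made up for illustration, and SOME_DEFAULT_STOP_WORDS is only a small hand-picked subset of the default English stop list. It just shows why "this", "is" and "a" vanish with a non-empty stop list and survive with an empty one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopFilterSketch {
    // Assumption: a hand-picked subset of the default English stop words,
    // enough to reproduce the behaviour described in the question.
    static final String[] SOME_DEFAULT_STOP_WORDS = {"a", "an", "and", "is", "the", "this"};

    // Hypothetical stand-in for an analyzer: lowercase the text, split on
    // anything that is not a letter or digit, then drop terms in the stop list.
    static List<String> analyze(String text, String[] stopWords) {
        Set<String> stopSet = new HashSet<>(Arrays.asList(stopWords));
        List<String> kept = new ArrayList<>();
        for (String term : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!term.isEmpty() && !stopSet.contains(term)) {
                kept.add(term);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String text = "hello,this,is,a,keyword,hello!,nicetomeetyou";
        // With a stop list: "this", "is" and "a" are dropped.
        System.out.println(analyze(text, SOME_DEFAULT_STOP_WORDS));
        // With an empty stop list: every term is kept.
        System.out.println(analyze(text, new String[] {}));
    }
}
```

Passing an empty array here plays the same role as the empty String[] in the constructor call above: nothing matches the stop list, so nothing is filtered out.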

