lucene-java-user mailing list archives

From Jake Clawson
Subject Lucene 5.5.0 StopFilter Error
Date Thu, 25 Feb 2016 21:43:30 GMT
I am trying to use StopFilter in Lucene 5.5.0. I tried the following:

package lucenedemo;

import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.Iterator;

import org.apache.lucene.*;
import org.apache.lucene.analysis.*;
import org.apache.lucene.analysis.standard.*;
import org.apache.lucene.analysis.core.StopFilter;
import org.apache.lucene.analysis.en.EnglishAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.util.CharArraySet;
import org.apache.lucene.util.AttributeFactory;
import org.apache.lucene.util.Version;

public class lucenedemo {

    public static void main(String[] args) throws Exception {
        System.out.println(removeStopWords("hello how are you? I am fine. This is a great day!"));
    }

    public static String removeStopWords(String strInput) throws Exception {
        AttributeFactory factory = AttributeFactory.DEFAULT_ATTRIBUTE_FACTORY;
        StandardTokenizer tokenizer = new StandardTokenizer(factory);
        tokenizer.setReader(new StringReader(strInput));
        CharArraySet stopWords = EnglishAnalyzer.getDefaultStopSet();

        TokenStream streamStop = new StopFilter(tokenizer, stopWords);
        StringBuilder sb = new StringBuilder();
        CharTermAttribute charTermAttribute = tokenizer.addAttribute(CharTermAttribute.class);
        while (streamStop.incrementToken()) {
            String term = charTermAttribute.toString();
            sb.append(term + " ");
        }

        return sb.toString();
    }
}

But it gives me the following error:

Exception in thread "main" java.lang.IllegalStateException: TokenStream contract violation:
reset()/close() call missing, reset() called multiple times, or subclass does not call super.reset().
Please see Javadocs of TokenStream class for more information about the correct consuming
at org.apache.lucene.analysis.Tokenizer$
at org.apache.lucene.analysis.standard.StandardTokenizerImpl.zzRefill(
at org.apache.lucene.analysis.standard.StandardTokenizerImpl.getNextToken(
at org.apache.lucene.analysis.standard.StandardTokenizer.incrementToken(
at org.apache.lucene.analysis.util.FilteringTokenFilter.incrementToken(
at lucenedemo.lucenedemo.removeStopWords(
at lucenedemo.lucenedemo.main(

What exactly am I doing wrong here? I have closed both the Tokenizer and TokenStream classes.
Is there something else I am missing here?

Any help would be greatly appreciated.

Jake Clawson
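
A note on the error: the TokenStream Javadocs describe the consuming workflow as reset(), then incrementToken() until it returns false, then end(), then close(). The code above calls incrementToken() without calling reset() first, which matches the first cause listed in the exception message. The sketch below is a stdlib-only model of that contract, not Lucene code. ToyTokenizer, ILLEGAL_STATE_READER, and consume are hypothetical names made up for illustration, and the idea that a tokenizer guards its input with a placeholder reader until reset() is called is an assumption about the mechanism, not a copy of the library's source.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Stdlib-only model of a reset() contract. This is NOT Lucene code;
// every name here is hypothetical and exists only for illustration.
class TokenizerContractSketch {

    // Placeholder reader that fails on any read. It stands in for the
    // guard a tokenizer could hold before reset() has been called.
    static final Reader ILLEGAL_STATE_READER = new Reader() {
        @Override
        public int read(char[] cbuf, int off, int len) {
            throw new IllegalStateException("reset() call missing");
        }

        @Override
        public void close() {
            // nothing to release
        }
    };

    // Toy whitespace tokenizer with setReader()/reset()/close() steps.
    static class ToyTokenizer {
        private Reader input = ILLEGAL_STATE_READER;
        private Reader inputPending = ILLEGAL_STATE_READER;

        // Only records the reader; it is not readable until reset().
        void setReader(Reader reader) {
            inputPending = reader;
        }

        // Swaps the pending reader in, making the input readable.
        void reset() {
            input = inputPending;
            inputPending = ILLEGAL_STATE_READER;
        }

        // Returns the next whitespace-delimited token, or null at end of input.
        String nextToken() {
            StringBuilder sb = new StringBuilder();
            try {
                int c;
                while ((c = input.read()) != -1) {
                    if (Character.isWhitespace(c)) {
                        if (sb.length() > 0) {
                            break;
                        }
                    } else {
                        sb.append((char) c);
                    }
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return sb.length() > 0 ? sb.toString() : null;
        }

        // Releases the current reader and restores the guard.
        void close() {
            try {
                input.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            input = ILLEGAL_STATE_READER;
            inputPending = ILLEGAL_STATE_READER;
        }
    }

    // Consumes all tokens; callReset toggles the step the contract requires.
    static String consume(String text, boolean callReset) {
        ToyTokenizer tokenizer = new ToyTokenizer();
        tokenizer.setReader(new StringReader(text));
        if (callReset) {
            tokenizer.reset();
        }
        StringBuilder out = new StringBuilder();
        try {
            String token;
            while ((token = tokenizer.nextToken()) != null) {
                out.append(token).append(' ');
            }
        } finally {
            tokenizer.close();
        }
        return out.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(consume("hello great day", true));
        try {
            consume("hello great day", false);
        } catch (IllegalStateException e) {
            System.out.println("without reset(): " + e.getMessage());
        }
    }
}
```

In the posted removeStopWords method, the corresponding change would likely be a call to streamStop.reset() before the while loop, with streamStop.end() and streamStop.close() after it.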
