lucene-dev mailing list archives

From Spyros Kapnissis
Subject WhitespaceTokenizer 4.0 issue
Date Thu, 08 Nov 2012 13:20:46 GMT

We noticed the following issue during our recent code migration to LUCENE_40. The test below
fails with an ArrayIndexOutOfBoundsException: -1. It passes only if tokenizer.reset()
is called before the first call to incrementToken().

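public void whitespaceTokTest() throws IOException {

    String text = "a b c d";
    Tokenizer tokenizer = new WhitespaceTokenizer(Version.LUCENE_40, new StringReader(text));
    CharTermAttribute termAtt = tokenizer.addAttribute(CharTermAttribute.class);
    List<String> tokens = new ArrayList<String>();
    // Note: no tokenizer.reset() before the loop; this is the case that fails on LUCENE_40.
    while (tokenizer.incrementToken()) {
        // Collect each term (loop body assumed; CharTermAttribute is the 4.0 term attribute).
        tokens.add(termAtt.toString());
    }
    assertEquals(Arrays.asList("a", "b", "c", "d"), tokens);
}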


This used to work, at least until LUCENE_33. Is this a bug, or am I missing something? 
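
For comparison, here is a minimal sketch of the variant that passes, with reset() called before the first incrementToken(). It assumes the Lucene 4.0 package layout (WhitespaceTokenizer in org.apache.lucene.analysis.core). The class name and the tokenize helper are made up for illustration, and the end() and close() calls are included only as general TokenStream hygiene, not because the test needs them.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

// Hypothetical example class; not part of Lucene or of our code base.
public class WhitespaceTokenizerResetExample {

    // Hypothetical helper: returns the terms WhitespaceTokenizer produces for the text.
    static List<String> tokenize(String text) throws IOException {
        Tokenizer tokenizer = new WhitespaceTokenizer(Version.LUCENE_40, new StringReader(text));
        CharTermAttribute termAtt = tokenizer.addAttribute(CharTermAttribute.class);
        List<String> tokens = new ArrayList<String>();
        try {
            // The call that makes the difference: reset before the first incrementToken().
            tokenizer.reset();
            while (tokenizer.incrementToken()) {
                tokens.add(termAtt.toString());
            }
            // Signal end of stream, then release the underlying reader.
            tokenizer.end();
        } finally {
            tokenizer.close();
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(tokenize("a b c d"));
    }
}
```

The only difference from the failing test is the reset() call, plus end() and close() around the loop.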

Thank you, 