lucene-java-user mailing list archives

From Trejkaz <trej...@trypticon.org>
Subject Bug? "TokenStream contract violation: close() call missing", but close() call was not actually missing
Date Tue, 26 May 2015 01:44:53 GMT
Hi all.

I think I have found a bug: Tokenizer complains that close() was not
called, but on further investigation it looks like we are not the ones
opening the resource in question. Lucene is opening it.

Firstly, the tests rely on this method, which may or may not be
correct, but I tried to follow the docs:

    private List<String> consume(TokenStream stream) throws Exception {
        ImmutableList.Builder<String> tokens = ImmutableList.builder();

        //The consumer calls reset().
        stream.reset();
        try {
            // The consumer retrieves attributes from the stream and
            // stores local references to all attributes it wants to access.
            CharTermAttribute termAttribute =
                    stream.getAttribute(CharTermAttribute.class);
            // The consumer calls incrementToken() until it returns
            // false, consuming the attributes after each call.
            while (stream.incrementToken()) {
                tokens.add(termAttribute.toString());
            }
            // The consumer calls end() so that any end-of-stream
            // operations can be performed.
            stream.end();
        } finally {
            // The consumer calls close() to release any resource when
            // finished using the TokenStream.
            stream.close();
        }

        return tokens.build();
    }

I also tested reusing the stream for multiple readers, which seems to work:

    @Test
    public void testStreamReuse() throws Exception {
        Tokenizer stream = new StandardTokenizer(new StringReader("reader #1"));
        assertThat(consume(stream), contains("reader", "1"));

        stream.setReader(new StringReader("reader 2"));
        assertThat(consume(stream), contains("reader", "2"));
    }

But if the reader throws an exception:

    @Test
    public void testStreamReuseAfterFailure() throws Exception {
        class FailingReader extends Reader {
            @Override
            public int read(@NotNull char[] buffer, int off, int len)
                    throws IOException {
                throw new IOException("Synthetic exception");
            }

            @Override
            public void close() throws IOException {
                throw new IOException("Synthetic exception");
            }
        }

        Tokenizer stream = new StandardTokenizer(new FailingReader());
        try {
            consume(stream);
            fail("Expected IOException");
        } catch (IOException e) {
            // Expected
        }

        stream.setReader(new StringReader("working reader"));
        assertThat(consume(stream), contains("working", "reader"));
    }

This fails:

    java.lang.IllegalStateException: TokenStream contract violation: close() call missing
            at org.apache.lucene.analysis.Tokenizer.setReader(Tokenizer.java:90)
            at TestStandardTokenizerStandalone.testStreamReuseAfterFailure(TestStandardTokenizerStandalone.java:64)

As far as I can see from the code, I am calling close() correctly.
Maybe I'm not, though, so is this a bug in my code or a bug in Lucene?
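
One possible explanation, sketched here without Lucene on the classpath: if
close() on the tokenizer first closes the underlying reader and only then
marks itself as closed, an exception from Reader.close() would leave the
"closed" state unset. ModelTokenizer below is a hypothetical stand-in written
for this illustration. It is an assumption about the behaviour, not Lucene's
actual Tokenizer code, and only the exception message is copied from the
stack trace above.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class CloseContractSketch {

    /** Hypothetical model of a reusable tokenizer; NOT Lucene's implementation. */
    static class ModelTokenizer {
        // Non-null while a reader is attached and has not been closed successfully.
        private Reader input;

        void setReader(Reader reader) {
            if (input != null) {
                throw new IllegalStateException(
                        "TokenStream contract violation: close() call missing");
            }
            input = reader;
        }

        void close() throws IOException {
            input.close(); // assumption: if this throws, the next line never runs
            input = null;  // so the tokenizer still looks "not closed"
        }
    }

    /** Same shape as the FailingReader in the test: read() and close() both throw. */
    static class FailingReader extends Reader {
        @Override
        public int read(char[] buffer, int off, int len) throws IOException {
            throw new IOException("Synthetic exception");
        }

        @Override
        public void close() throws IOException {
            throw new IOException("Synthetic exception");
        }
    }

    /** Returns "reuse ok", or the contract-violation message if reuse is rejected. */
    static String reuseAfterFailedClose() {
        ModelTokenizer tokenizer = new ModelTokenizer();
        tokenizer.setReader(new FailingReader());
        try {
            tokenizer.close(); // close() IS called, but the reader's close() throws
        } catch (IOException expected) {
            // Mirrors the catch block in testStreamReuseAfterFailure.
        }
        try {
            tokenizer.setReader(new StringReader("working reader"));
            return "reuse ok";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(reuseAfterFailedClose());
    }
}
```

In this model, the second setReader() is rejected even though close() was
called, which is the symptom in the failing test. Whether the real Tokenizer
orders its close() this way would need checking against the Lucene source.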

TX
