lucene-dev mailing list archives

From "Patrick Jungermann (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1939) IndexOutOfBoundsException at ShingleMatrixFilter's Iterator#hasNext method
Date Sat, 03 Oct 2009 16:25:23 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761907#action_12761907 ]

Patrick Jungermann commented on LUCENE-1939:
--------------------------------------------

Karl, you're right, sorry. I used the current release of Solr, version 1.3.0, which uses Lucene
2.4.1. Solr 1.4, which will be released soon, uses Lucene 2.9. As far as I can see, the filter
did not change in the lines that cause the problem. But I don't know whether this is the real root cause.

Now I have also tested this with the current trunk of Solr, which already uses Lucene 2.9.
First I tried a simple example with an analysis chain consisting of the WhitespaceTokenizer
followed by the ShingleMatrixFilter, and no problem occurred.

Then I tried the other field type configuration, the one I had used in the earlier test, and the
exception was thrown.

{code}
Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
	at java.util.ArrayList.RangeCheck(Unknown Source)
	at java.util.ArrayList.get(Unknown Source)
	at org.apache.lucene.analysis.shingle.ShingleMatrixFilter$Matrix$1.hasNext(ShingleMatrixFilter.java:841)
	at org.apache.lucene.analysis.shingle.ShingleMatrixFilter.produceNextToken(ShingleMatrixFilter.java:485)
	at org.apache.lucene.analysis.shingle.ShingleMatrixFilter.incrementToken(ShingleMatrixFilter.java:372)
	at org.apache.lucene.analysis.TokenStream.next(TokenStream.java:401)
	at org.apache.lucene.analysis.shingle.ShingleMatrixFilter.next(ShingleMatrixFilter.java:405)
	...
{code}

To find the cause, I removed the filters one by one. After a lot of tests, I found that
the problem was caused by the use of
# WhitespaceTokenizer
# ShingleMatrixFilter
# RemoveDuplicatesTokenFilter
in that order. If I swap the positions of the two filters, everything seems to work
fine.

This time, I tested this only with the field analysis view, with different data.

Also, it was really strange that the exception only occurred on the first analysis request,
and extremely rarely a second time. But it was thrown on every first request.

> IndexOutOfBoundsException at ShingleMatrixFilter's Iterator#hasNext method
> --------------------------------------------------------------------------
>
>                 Key: LUCENE-1939
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1939
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: contrib/analyzers
>    Affects Versions: 2.9
>            Reporter: Patrick Jungermann
>            Assignee: Karl Wettin
>         Attachments: ShingleMatrixFilter_IndexOutOfBoundsException.patch
>
>
> I tried to use the ShingleMatrixFilter within Solr. To test the functionality etc., I
> first used the built-in field analysis view. The filter was configured to be used only at query
> time analysis with "_" as spacer character and a min. and max. shingle size of 2. The generation
> of the shingles for query strings with this filter seems to work in this view, but when I turned
> on the highlighting of indexed terms that match the query terms, the exception was thrown.
> Also, each time I tried to query the index, the exception was thrown immediately.
> Stacktrace:
> {code}
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
> 	at java.util.ArrayList.RangeCheck(Unknown Source)
> 	at java.util.ArrayList.get(Unknown Source)
> 	at org.apache.lucene.analysis.shingle.ShingleMatrixFilter$Matrix$1.hasNext(ShingleMatrixFilter.java:729)
> 	at org.apache.lucene.analysis.shingle.ShingleMatrixFilter.next(ShingleMatrixFilter.java:380)
> 	at org.apache.lucene.analysis.StopFilter.next(StopFilter.java:120)
> 	at org.apache.lucene.analysis.TokenStream.next(TokenStream.java:47)
> 	...
> {code}
> Within the hasNext method, the {{s-1}}-th Column is requested from the ArrayList {{columns}},
> but {{columns}} does not contain that entry.
> I created a patch that checks whether {{columns}} contains enough entries.
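
For illustration only, the kind of check described above can be sketched in isolation. This is not the attached patch and does not use the real ShingleMatrixFilter internals: the class name ColumnsGuardSketch, the helper hasEntry and the plain list of strings are all assumptions made up for this example. It only shows how requesting entry s-1 from a list that is too short fails, and how a size check avoids the exception.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: a plain list stands in for the filter's internal columns list.
class ColumnsGuardSketch {

    // Hypothetical helper: true only if columns has an entry at the given index.
    static boolean hasEntry(List<?> columns, int index) {
        return index >= 0 && index < columns.size();
    }

    public static void main(String[] args) {
        List<String> columns = new ArrayList<String>();
        columns.add("first column");

        int s = 2; // asks for entry s-1 = 1, but the list holds one entry (index 0)

        try {
            columns.get(s - 1); // unguarded access, comparable to the reported trace
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unguarded get failed: " + e.getMessage());
        }

        // Guarded access: check the size before asking for the entry.
        if (hasEntry(columns, s - 1)) {
            System.out.println("entry: " + columns.get(s - 1));
        } else {
            System.out.println("no entry at index " + (s - 1) + ", nothing more to iterate");
        }
    }
}
```

Whether returning "no more elements" is the right behaviour when the entry is missing, or whether the missing entry points to a deeper problem in how the columns are filled, is a separate question that this sketch does not answer.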

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

