lucene-dev mailing list archives

From "Karl Wettin (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1939) IndexOutOfBoundsException at ShingleMatrixFilter's Iterator#hasNext method
Date Sat, 03 Oct 2009 17:41:23 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761924#action_12761924
] 

Karl Wettin commented on LUCENE-1939:
-------------------------------------

The exception is thrown when ts#next (incrementToken) is called again after already having
returned null (false) once. So this is a nice catch!

But this means that RemoveDuplicatesTokenFilter in Solr calls incrementToken one extra time
for some reason. Can you please post the complete stacktrace so I can take a look in there
too? 

I suppose the expected behaviour would be that a token stream keeps returning false when
incrementToken is called again after it has already returned false, but the javadocs don't
really say anything about this, nor is there a generic test case that ensures this for all
filters. Thus this error might be present in other filters too. I'll see if I can do
something about that before committing.
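
A generic check of that kind could look roughly like the sketch below. It is only an illustration: SimpleTokenStream, ListTokenStream and ExhaustionCheck are hypothetical stand-ins made up for this example, not Lucene's TokenStream API, so the idea can be shown without any Lucene dependency.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a token stream; NOT Lucene's TokenStream class.
interface SimpleTokenStream {
    // Returns true if a token was produced, false once the stream is exhausted.
    boolean incrementToken();
}

// A well-behaved stream: once exhausted, it keeps returning false.
class ListTokenStream implements SimpleTokenStream {
    private final Iterator<String> tokens;
    private boolean exhausted = false;

    ListTokenStream(List<String> tokens) {
        this.tokens = tokens.iterator();
    }

    @Override
    public boolean incrementToken() {
        if (exhausted || !tokens.hasNext()) {
            exhausted = true;
            return false;
        }
        tokens.next(); // consume one token
        return true;
    }
}

class ExhaustionCheck {
    // Drains the stream, then calls incrementToken() extraCalls more times.
    // Returns true only if every extra call returned false without throwing.
    static boolean staysExhausted(SimpleTokenStream stream, int extraCalls) {
        while (stream.incrementToken()) {
            // consume all tokens
        }
        try {
            for (int i = 0; i < extraCalls; i++) {
                if (stream.incrementToken()) {
                    return false; // produced a token after exhaustion
                }
            }
        } catch (RuntimeException e) {
            return false; // threw after exhaustion, as in this issue
        }
        return true;
    }
}
```

Running such a helper against every filter would flag any stream that throws, or starts producing tokens again, when it is asked for more after it has already signalled the end.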

Thanks for the report Patrick!

> IndexOutOfBoundsException at ShingleMatrixFilter's Iterator#hasNext method
> --------------------------------------------------------------------------
>
>                 Key: LUCENE-1939
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1939
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: contrib/analyzers
>    Affects Versions: 2.9
>            Reporter: Patrick Jungermann
>            Assignee: Karl Wettin
>         Attachments: ShingleMatrixFilter_IndexOutOfBoundsException.patch
>
>
> I tried to use the ShingleMatrixFilter within Solr. To test the functionality, I first
> used the built-in field analysis view. The filter was configured to be used only for
> query-time analysis, with "_" as the spacer character and a minimum and maximum shingle
> size of 2. Generating shingles for query strings with this filter seems to work in this
> view, but when I turned on the highlighting of indexed terms that match the query terms,
> the exception was thrown. Also, each time I tried to query the index, the exception was
> thrown immediately.
> Stacktrace:
> {code}
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
> 	at java.util.ArrayList.RangeCheck(Unknown Source)
> 	at java.util.ArrayList.get(Unknown Source)
> 	at org.apache.lucene.analysis.shingle.ShingleMatrixFilter$Matrix$1.hasNext(ShingleMatrixFilter.java:729)
> 	at org.apache.lucene.analysis.shingle.ShingleMatrixFilter.next(ShingleMatrixFilter.java:380)
> 	at org.apache.lucene.analysis.StopFilter.next(StopFilter.java:120)
> 	at org.apache.lucene.analysis.TokenStream.next(TokenStream.java:47)
> 	...
> {code}
> Within the hasNext method, the {{s-1}}-th column is requested from the ArrayList {{columns}},
> but {{columns}} does not contain that entry.
> I created a patch that checks whether {{columns}} contains enough entries.
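
The kind of guard described in the report can be sketched as follows. This is a hypothetical illustration only: ColumnLookup and its methods are invented for the example and are neither ShingleMatrixFilter's actual code nor the attached patch. It just shows a size check placed in front of an ArrayList lookup at index s-1.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not ShingleMatrixFilter's code
// and not the attached patch.
class ColumnLookup {
    private final List<String> columns = new ArrayList<>();

    void addColumn(String column) {
        columns.add(column);
    }

    // Unguarded: throws IndexOutOfBoundsException if entry s-1 does not exist.
    boolean hasEntryUnchecked(int s) {
        return columns.get(s - 1) != null;
    }

    // Guarded: verifies that the list holds enough entries before reading.
    boolean hasEntryChecked(int s) {
        if (s < 1 || s - 1 >= columns.size()) {
            return false;
        }
        return columns.get(s - 1) != null;
    }
}
```

With a single column stored, asking for s = 2 reproduces the situation in the stack trace (index 1, size 1): the unguarded lookup throws, while the guarded one simply reports that there is no such entry.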

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



