lucene-dev mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: [JENKINS] Lucene-Solr-Tests-trunk-java7 - Build # 3368 - Still Failing
Date Fri, 09 Nov 2012 12:43:33 GMT
The bug here (in my opinion) is that ThaiWordFilter is a filter at all
(it should be a tokenizer). Like WDF (WordDelimiterFilter) and other
filters that really should be tokenizers, it doesn't expect and can't
handle arbitrary input correctly (e.g. input that has been through a
shingle filter...)

Another problem is that offsetsAreCorrect=false allows offsets to
"go backwards" in the stream. But this leniency gives a false sense of
security, because if you then add a shingle filter you end up in a
situation like this one, where startOffset > endOffset.
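
To make the failure mode concrete, here is a minimal sketch in plain Java
(no Lucene dependency). It assumes a shingle takes its startOffset from its
first token and its endOffset from its last token. The token offsets, the
class name and both helper methods are made up for illustration; the check
is only a simplified stand-in for the one that produced the error quoted
below.

```java
public class ShingleOffsetSketch {

    // Hypothetical helper: offsets of a shingle built from two tokens,
    // assuming it spans from the first token's start to the last token's end.
    // Each token is represented as {startOffset, endOffset}.
    public static int[] shingleOffsets(int[] first, int[] last) {
        return new int[] { first[0], last[1] };
    }

    // Simplified stand-in for the offset sanity check in the quoted error.
    public static void checkOffsets(int startOffset, int endOffset) {
        if (startOffset < 0 || endOffset < startOffset) {
            throw new IllegalArgumentException(
                "startOffset must be non-negative, and endOffset must be >= startOffset, "
                + "startOffset=" + startOffset + ",endOffset=" + endOffset);
        }
    }

    public static void main(String[] args) {
        // Illustrative offsets only: the second token "goes backwards",
        // which offsetsAreCorrect=false tolerates on its own.
        int[] tokenA = {5, 8};
        int[] tokenB = {0, 3};

        // Each token is individually valid (end >= start) ...
        checkOffsets(tokenA[0], tokenA[1]);
        checkOffsets(tokenB[0], tokenB[1]);

        // ... but the shingle built from them is not.
        int[] shingle = shingleOffsets(tokenA, tokenB);
        try {
            checkOffsets(shingle[0], shingle[1]);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Each token passes the check on its own; only the combination fails, which
is why tolerating backwards offsets upstream does not protect a later
shingle filter.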

On Fri, Nov 9, 2012 at 7:31 AM, Apache Jenkins Server
<jenkins@builds.apache.org> wrote:
> Error Message:
> startOffset must be non-negative, and endOffset must be >= startOffset, startOffset=5,endOffset=3
>
> Stack Trace:
> java.lang.IllegalAr
> [junit4:junit4]   2> Exception from random analyzer:
> [junit4:junit4]   2> charfilters=
> [junit4:junit4]   2> tokenizer=
> [junit4:junit4]   2>   org.apache.lucene.analysis.core.WhitespaceTokenizer(LUCENE_50,
org.apache.lucene.analysis.core.TestRandomChains$CheckThatYouDidntReadAnythingReaderWrapper@7f4aaa58)
> [junit4:junit4]   2> filters=
> [junit4:junit4]   2>   org.apache.lucene.analysis.miscellaneous.LengthFilter(false,
org.apache.lucene.analysis.ValidatingTokenFilter@1, -30, 69)
> [junit4:junit4]   2>   org.apache.lucene.analysis.shingle.ShingleFilter(org.apache.lucene.analysis.ValidatingTokenFilter@37caea,
tpzabzsxye)
> [junit4:junit4]   2>   org.apache.lucene.analysis.th.ThaiWordFilter(LUCENE_50, org.apache.lucene.analysis.ValidatingTokenFilter@37caea)
> [junit4:junit4]   2>   org.apache.lucene.analysis.shingle.ShingleFilter(org.apache.lucene.analysis.ValidatingTokenFilter@37caea)
> [junit4:junit4]   2> offsetsAreCorrect=false
