lucene-solr-user mailing list archives

From Bjarke Buur Mortensen <morten...@eluence.com>
Subject Re: Complexphrase treats wildcards differently than other query parsers
Date Fri, 06 Oct 2017 10:45:51 GMT
Thanks a lot for your effort, Tim.

Looking at it from the Solr side, I see some use of local classes. The
snippet below in particular caught my eye (in
solr/core/src/java/org/apache/solr/search/ComplexPhraseQParserPlugin.java).
The instance of ComplexPhraseQueryParser is not the clean one from Lucene,
but a modified one. If any of the modifications messes with the analysis
logic, well then that might answer it.

What do you make of it?

lparser = new ComplexPhraseQueryParser(defaultField, getReq().getSchema().getQueryAnalyzer()) {
    protected Query newWildcardQuery(org.apache.lucene.index.Term t) {
        try {
            org.apache.lucene.search.Query wildcardQuery =
                reverseAwareParser.getWildcardQuery(t.field(), t.text());
            setRewriteMethod(wildcardQuery);
            return wildcardQuery;
        } catch (SyntaxError e) {
            throw new RuntimeException(e);
        }
    }

    private Query setRewriteMethod(org.apache.lucene.search.Query query) {
        if (query instanceof MultiTermQuery) {
            ((MultiTermQuery) query).setRewriteMethod(
                org.apache.lucene.search.MultiTermQuery.SCORING_BOOLEAN_REWRITE);
        }
        return query;
    }

    protected Query newRangeQuery(String field, String part1, String part2,
                                  boolean startInclusive, boolean endInclusive) {
        boolean reverse =
            reverseAwareParser.isRangeShouldBeProtectedFromReverse(field, part1);
        return super.newRangeQuery(field,
            reverse ? reverseAwareParser.getLowerBoundForReverse() : part1,
            part2,
            startInclusive || reverse,
            endInclusive);
    }
};
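
To make the concern concrete, here is a minimal, self-contained sketch in plain Java. It uses no Lucene or Solr classes: BaseParser, DelegatingParser, and upperCaseVowels are all hypothetical stand-ins, and the sketch simply assumes that term normalization happens inside the factory method that gets overridden. Whether that is actually the case in ComplexPhraseQueryParser is exactly the open question, so treat this as an illustration of the suspicion, not a diagnosis.

```java
// Hypothetical illustration only: none of these classes exist in Lucene or Solr.
class WildcardOverrideSketch {

    /** Stand-in for the mock "uppercase vowel" analysis used in the thread. */
    static String upperCaseVowels(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (char c : text.toCharArray()) {
            sb.append("aeiou".indexOf(c) >= 0 ? Character.toUpperCase(c) : c);
        }
        return sb.toString();
    }

    /** Assumed "clean" parser: the factory method normalizes the term itself. */
    static class BaseParser {
        String parse(String field, String rawTerm) {
            return newWildcardQuery(field, rawTerm);
        }

        protected String newWildcardQuery(String field, String rawTerm) {
            return field + ":" + upperCaseVowels(rawTerm);
        }
    }

    /** Stand-in for a second parser that the override hands the term to. */
    static class DelegatingParser {
        String getWildcardQuery(String field, String text) {
            // Assumption for the sketch: no normalization is applied here.
            return field + ":" + text;
        }
    }

    /** "Modified" parser: an anonymous subclass that replaces the factory method. */
    static BaseParser modifiedParser() {
        DelegatingParser delegate = new DelegatingParser();
        return new BaseParser() {
            @Override
            protected String newWildcardQuery(String field, String rawTerm) {
                // The base class normalization is never reached on this path.
                return delegate.getWildcardQuery(field, rawTerm);
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(new BaseParser().parse("name", "the*")); // name:thE*
        System.out.println(modifiedParser().parse("name", "the*")); // name:the*
    }
}
```

If the real classes behave anything like this, the override would be enough to explain wildcard terms coming out unanalyzed, but that would need to be confirmed against the actual Lucene and Solr sources.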

Thanks,
Bjarke

2017-10-05 21:15 GMT+02:00 Allison, Timothy B. <tallison@mitre.org>:

> After some more digging, I'm wrong even at the Lucene level.
>
> When I use the CustomAnalyzer and make my UC vowel mock filter
> MultitermAware, I get this with Lucene in trunk:
>
> "the* quick~" name:thE* name:qUIck~2 name:thE name:qUIck
>
> So, there's room for improvement with phrases, but the regular multiterms
> should be ok.
>
> Still no answer for you...
>
> 2017-10-05 14:34 GMT+02:00 Allison, Timothy B. <tallison@mitre.org>:
>
> > There's every chance that I'm missing something at the Solr level, but
> > it _looks_, at the Lucene level, like ComplexPhraseQueryParser is still
> > not applying analysis to multiterms.
> >
> > When I call this on 7.0.0:
> >     QueryParser qp = new ComplexPhraseQueryParser(defaultFieldName, analyzer);
> >     return qp.parse(qString);
> >
> > where the analyzer is a mock "uppercase vowel" analyzer[1] and the
> > qString is:
> >
> > "the* quick~" the* quick~ the quick
> >
> > I get this:
> > "the* quick~" name:the* name:quick~2 name:thE name:qUIck
>
>
