lucene-java-user mailing list archives

From Adrien Grand <jpou...@gmail.com>
Subject Re: null Query from MultiFieldQueryParser.getFieldQuery
Date Thu, 29 Sep 2016 15:24:49 GMT
I'm not very familiar with this part of the code base so I could easily
overlook something. Maybe you can open a JIRA and attach a minimal test
case that reproduces the issue?

On Mon, 19 Sep 2016 at 13:48, Oliver Kaleske <Oliver.Kaleske@ptvgroup.com>
wrote:

> Hi,
>
> While updating Lucene from 6.1.0 to 6.2.0, I came across the following:
>
> We have a subclass of MultiFieldQueryParser (MFQP) for creating a custom
> type of Query, which calls getFieldQuery() on its base class (MFQP).
> For each of its search fields, this method creates a Query by calling
> getFieldQuery() on QueryParserBase.
> Ultimately, we wind up in QueryBuilder's createFieldQuery() method, which
> depending on the number of tokens (etc.) decides what type of Query to
> return: a TermQuery, BooleanQuery, PhraseQuery, or MultiPhraseQuery.
>
> Back in MFQP.getFieldQuery(), a variable maxTerms is determined depending
> on the type of Query returned: for a TermQuery or a BooleanQuery, its value
> will in general be nonzero, clauses are created, and a non-null Query is
> returned.
> However, other Query subclasses result in maxTerms=0, an empty list of
> clauses, and finally null is returned.
>
> To me, this seems like a bug, but I might well be missing something.
> The comment "// happens for stopwords" on the return null statement,
> however, seems to suggest that Query types other than TermQuery and
> BooleanQuery were not considered properly here.
> I should point out that our custom MFQP subclass so far does some rather
> unsophisticated tokenization before calling getFieldQuery() on each token,
> so characters like '*' may still slip through. So perhaps with proper
> tokenization, it is guaranteed that only TermQuery and BooleanQuery can
> come out of the chain of getFieldQuery() calls, and not handling
> (Multi)PhraseQuery in MFQP.getFieldQuery() can never cause trouble?
>
> The code in MFQP.getFieldQuery() dates back to the following change
> (06 Jul 2016), when it was substantially expanded:
> "LUCENE-2605: Add classic QueryParser option setSplitOnWhitespace() to
> control whether to split on whitespace prior to text analysis.  Default
> behavior remains unchanged: split-on-whitespace=true."
>
> Best regards,
> Oliver
>
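
The control flow described in the quoted message can be sketched as below.
This is a simplified model built only from that description, not Lucene's
actual source: the class MaxTermsSketch and its nested Query, TermQuery,
BooleanQuery and PhraseQuery types are hypothetical stand-ins for the real
Lucene classes, and the per-field queries are passed in directly instead of
coming from QueryParserBase. It only illustrates how a query type that does
not raise maxTerms can lead to an empty clause list and a null result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in model; none of these types are the real Lucene classes.
final class MaxTermsSketch {
    interface Query {}

    static final class TermQuery implements Query {
        final String field;
        final String text;
        TermQuery(String field, String text) { this.field = field; this.text = text; }
    }

    static final class BooleanQuery implements Query {
        final List<Query> clauses;
        BooleanQuery(List<Query> clauses) { this.clauses = clauses; }
    }

    static final class PhraseQuery implements Query {
        final String field;
        final List<String> terms;
        PhraseQuery(String field, List<String> terms) { this.field = field; this.terms = terms; }
    }

    // Assumed behaviour, per the message: only TermQuery and BooleanQuery raise maxTerms.
    static int maxTerms(List<Query> fieldQueries) {
        int maxTerms = 0;
        for (Query q : fieldQueries) {
            if (q instanceof TermQuery) {
                maxTerms = Math.max(1, maxTerms);
            } else if (q instanceof BooleanQuery) {
                maxTerms = Math.max(maxTerms, ((BooleanQuery) q).clauses.size());
            }
            // Any other type (e.g. PhraseQuery) leaves maxTerms unchanged.
        }
        return maxTerms;
    }

    // Collects one clause per term position and field; returns null if nothing was collected.
    static Query combine(List<Query> fieldQueries) {
        int maxTerms = maxTerms(fieldQueries);
        List<Query> clauses = new ArrayList<>();
        for (int termNum = 0; termNum < maxTerms; termNum++) {
            for (Query q : fieldQueries) {
                if (q instanceof TermQuery && termNum == 0) {
                    clauses.add(q);
                } else if (q instanceof BooleanQuery) {
                    List<Query> nested = ((BooleanQuery) q).clauses;
                    if (termNum < nested.size()) {
                        clauses.add(nested.get(termNum));
                    }
                }
            }
        }
        if (clauses.isEmpty()) {
            return null; // the branch commented "happens for stopwords" in the message
        }
        return new BooleanQuery(clauses);
    }

    public static void main(String[] args) {
        List<Query> phraseOnly = Arrays.<Query>asList(
                new PhraseQuery("title", Arrays.asList("new", "york")));
        List<Query> termOnly = Arrays.<Query>asList(new TermQuery("title", "york"));
        System.out.println("phrase only -> null? " + (combine(phraseOnly) == null));
        System.out.println("term only   -> null? " + (combine(termOnly) == null));
    }
}
```

In this model, a field list that yields only a PhraseQuery produces null,
while a single TermQuery produces a non-null result. A test case attached to
a JIRA issue would need to show the same effect against the real
MultiFieldQueryParser.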
