lucene-dev mailing list archives

From "Michael Pichler (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (LUCENE-7437) QueryParser with wildcard search does not use Analyzer's tokenizer
Date Wed, 07 Sep 2016 12:51:20 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-7437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15470484#comment-15470484 ]

Michael Pichler edited comment on LUCENE-7437 at 9/7/16 12:50 PM:
------------------------------------------------------------------

Hello Uwe,
thanks for the quick reply!

Since we do not use stemming at index time, this would be a suitable option for us (in fact, we
already override getPrefixQuery and getFuzzyQuery in our custom QueryParser, as is done in
AnalyzingQueryParser).

I tried {{AnalyzingQueryParser}} in the test program and got the *same result* (which is why
I re-opened the issue). Although the analyzer is now used for normalization, it is still
not used for tokenizing the input.

{code}
    // QueryParser queryParser = new QueryParser(FIELD_NAME, analyzer);
    QueryParser queryParser = new AnalyzingQueryParser(FIELD_NAME, analyzer);
    System.err.println("using QueryParser " + queryParser.getClass());
{code}

{noformat}
using QueryParser class org.apache.lucene.queryparser.analyzing.AnalyzingQueryParser
search: 'qwert asdf*', query: '+f:qwert +f:asdf*', #hits: 1
search: 'qwert_asdf*', query: 'f:qwert_asdf*', #hits: 0
  ^^^ expected 1 hit(s), got 0
search: 'qwert%asdf*', query: 'f:qwert%asdf*', #hits: 0
  ^^^ expected 1 hit(s), got 0
{noformat}
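
To make the expected behavior concrete, here is a minimal plain-Java sketch with no Lucene dependency. It is an illustration only, not Lucene's implementation and not a proposed patch. The class and method names ({{WildcardSplitSketch}}, {{tokenize}}, {{expectedQuery}}) are hypothetical, and it rests on three assumptions: the tokenizer splits at every non-letter character and lowercases (roughly what SimpleAnalyzer does), the default operator is AND (as in the output above), and only a single trailing {{*}} is handled. Unlike the real parser, it treats the whole search string as one term, so the space in the first search is split by the same rule as the underscore.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java sketch (no Lucene) of the query the reporter expects for wildcard terms. */
class WildcardSplitSketch {

    /** Assumption: split at every non-letter character and lowercase the letters. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A non-letter ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    /** Renders the expected query string; only the last token keeps the trailing '*'. */
    static String expectedQuery(String field, String term) {
        boolean prefix = term.endsWith("*");
        List<String> tokens = tokenize(term); // '*' is not a letter, so it is dropped here
        if (tokens.size() == 1) {
            return field + ":" + tokens.get(0) + (prefix ? "*" : "");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tokens.size(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            // Assumption: AND is the default operator, so every clause is required.
            sb.append('+').append(field).append(':').append(tokens.get(i));
            if (prefix && i == tokens.size() - 1) {
                sb.append('*');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"qwert asdf*", "qwert_asdf*", "qwert%asdf*"}) {
            System.out.println("search: '" + s + "', expected query: '"
                    + expectedQuery("f", s) + "'");
        }
    }
}
```

Under these assumptions all three searches map to the same two-clause query, which is the form that returned one hit in the first search above.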





> QueryParser with wildcard search does not use Analyzer's tokenizer
> ------------------------------------------------------------------
>
>                 Key: LUCENE-7437
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7437
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/queryparser
>    Affects Versions: 6.2
>            Reporter: Michael Pichler
>            Assignee: Uwe Schindler
>         Attachments: LuceneTest.java
>
>
> A tokenizer that splits at underscores (e.g. the one used by SimpleAnalyzer) splits "qwert_asdfghjkl" into two words at indexing time.
> Searches for "qwert asdf*" or "qwert_asdfghjkl" work as expected.
> However, when a query contains a wildcard, e.g. "qwert_asdf*", the query parser does not use its analyzer's tokenizer to split the words and thus finds no results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

