Mailing-List: contact dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@lucene.apache.org
Date: Wed, 7 Sep 2016 12:33:21 +0000 (UTC)
From: "Michael Pichler (JIRA)" <jira@apache.org>
To: dev@lucene.apache.org
Message-ID: <JIRA.13003181.1473245584000.507179.1473251601058@Atlassian.JIRA>
In-Reply-To: <JIRA.13003181.1473245584000@Atlassian.JIRA>
References: <JIRA.13003181.1473245584000@Atlassian.JIRA> <JIRA.13003181.1473245584305@arcas>
Subject: [jira] [Comment Edited] (LUCENE-7437) QueryParser with wildcard
 search does not use Analyzer's tokenizer
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
archived-at: Wed, 07 Sep 2016 12:33:23 -0000


    [ https://issues.apache.org/jira/browse/LUCENE-7437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15470484#comment-15470484 ] 

Michael Pichler edited comment on LUCENE-7437 at 9/7/16 12:33 PM:
------------------------------------------------------------------

Hello Uwe,
thanks for the quick reply!

Since we do not use stemming at index time, this would be a suitable option for us (in fact, our custom QueryParser already overrides getPrefixQuery and getFuzzyQuery, as AnalyzingQueryParser does).

I tried to use {{AnalyzingQueryParser}} in the test program and got the *same result* (thus I re-opened the issue). Even though the analyzer gets involved, it is still not used for tokenizing the input.

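{code}
    // Default parser, kept commented out for comparison:
    // QueryParser queryParser = new QueryParser(FIELD_NAME, analyzer);
    // AnalyzingQueryParser also passes wildcard and prefix terms through the analyzer.
    QueryParser queryParser = new AnalyzingQueryParser(FIELD_NAME, analyzer);
    System.err.println("using QueryParser " + queryParser.getClass());
{code}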

{noformat}
using QueryParser class org.apache.lucene.queryparser.analyzing.AnalyzingQueryParser
search: 'qwert asdf*', query: '+f:qwert +f:asdf*', #hits: 1
search: 'qwert_asdf*', query: 'f:qwert_asdf*', #hits: 0
  ^^^ expected 1 hit(s), got 0
search: 'qwert%asdf*', query: 'f:qwert%asdf*', #hits: 0
  ^^^ expected 1 hit(s), got 0
{noformat}
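
To make the difference between the observed and the expected behaviour concrete, here is a minimal plain-Java sketch. It is a model only and does not use Lucene: the class {{WildcardTokenizeSketch}} and its methods are hypothetical names, {{tokenize}} merely approximates SimpleAnalyzer (split at every non-letter, lowercase), and it assumes a query without whitespace that ends in a single trailing '*'. It is not the code in LuceneTest.java and not a proposed patch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Plain-Java model of the behaviour discussed in LUCENE-7437; not Lucene code. */
class WildcardTokenizeSketch {

    /** Splits at every non-letter and lowercases (an approximation of SimpleAnalyzer). */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    /** True if at least one indexed term starts with the given prefix. */
    static boolean anyTermStartsWith(Set<String> indexedTerms, String prefix) {
        for (String term : indexedTerms) {
            if (term.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    /** Observed behaviour: everything before the trailing '*' is used as one prefix term. */
    static boolean matchUntokenized(Set<String> indexedTerms, String query) {
        String prefix = query.substring(0, query.length() - 1);
        return anyTermStartsWith(indexedTerms, prefix);
    }

    /**
     * Expected behaviour: tokenize the text before '*'; every token except the
     * last must be an indexed term, and the last token is matched as a prefix.
     */
    static boolean matchTokenized(Set<String> indexedTerms, String query) {
        List<String> tokens = tokenize(query.substring(0, query.length() - 1));
        if (tokens.isEmpty()) {
            return false;
        }
        for (int i = 0; i < tokens.size() - 1; i++) {
            if (!indexedTerms.contains(tokens.get(i))) {
                return false;
            }
        }
        return anyTermStartsWith(indexedTerms, tokens.get(tokens.size() - 1));
    }

    public static void main(String[] args) {
        // Index time: "qwert_asdfghjkl" becomes the two terms "qwert" and "asdfghjkl".
        Set<String> indexed = new HashSet<>(tokenize("qwert_asdfghjkl"));
        for (String query : new String[] {"qwert_asdf*", "qwert%asdf*"}) {
            System.out.println(query
                + " untokenized=" + matchUntokenized(indexed, query)
                + " tokenized=" + matchTokenized(indexed, query));
        }
    }
}
```

In this model the untokenized variant finds nothing for either query, because no indexed term starts with "qwert_asdf" or "qwert%asdf", which mirrors the 0 hits above. The tokenized variant matches both, which is the result I expected from the parser.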


> QueryParser with wildcard search does not use Analyzer's tokenizer
> ------------------------------------------------------------------
>
>                 Key: LUCENE-7437
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7437
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/queryparser
>    Affects Versions: 6.2
>            Reporter: Michael Pichler
>            Assignee: Uwe Schindler
>         Attachments: LuceneTest.java
>
>
> An analyzer whose tokenizer splits at underscores (e.g. SimpleAnalyzer) splits "qwert_asdfghjkl" into two words at index time.
> Searches for "qwert asdf*" or "qwert_asdfghjkl" work as expected.
> However, when a query contains a wildcard, e.g. "qwert_asdf*", the query parser does not use its analyzer's tokenizer to split the words and therefore finds no results.

