lucene-dev mailing list archives

From "Jack Krupansky (JIRA)" <j...@apache.org>
Subject [jira] [Created] (LUCENE-4382) Unicode escape no longer works for non-prefix wildcard terms
Date Thu, 13 Sep 2012 00:00:07 GMT
Jack Krupansky created LUCENE-4382:
--------------------------------------

             Summary: Unicode escape no longer works for non-prefix wildcard terms
                 Key: LUCENE-4382
                 URL: https://issues.apache.org/jira/browse/LUCENE-4382
             Project: Lucene - Core
          Issue Type: Bug
          Components: core/queryparser
    Affects Versions: 4.0-BETA
            Reporter: Jack Krupansky
             Fix For: 4.0


LUCENE-588 added support for escaping of wildcard characters, but when the de-escaping logic
was pushed down from the query parser (QueryParserBase) into WildcardQuery, support for Unicode
escaping (backslash, "u", and the four-digit hex Unicode code) was not included.

Two possible solutions:

1. Do the Unicode de-escaping in the query parser before calling getWildcardQuery.
2. Support Unicode de-escaping in WildcardQuery.

A trailing wildcard (a prefix query, such as "abc*") does not exhibit this problem because
full de-escaping is performed in the query parser before calling getPrefixQuery.
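
To make option 1 concrete, here is a minimal sketch of a Unicode-only de-escaping pass that could
run before getWildcardQuery is called. The class and method names are hypothetical and are not
part of Lucene. The sketch assumes that backslash is the wildcard escape character and that every
other escape must be passed through untouched so WildcardQuery can still interpret it. Unlike the
query parser, it leaves a malformed escape unchanged instead of reporting an error.

```java
// Hypothetical helper, not Lucene code: decodes only four-digit hex Unicode
// escapes and leaves every other backslash escape in place for later handling.
final class UnicodeUnescapeSketch {

    static String unescapeUnicodeOnly(String term) {
        StringBuilder out = new StringBuilder(term.length());
        int i = 0;
        while (i < term.length()) {
            char c = term.charAt(i);
            if (c == '\\' && i + 1 < term.length()) {
                char next = term.charAt(i + 1);
                if (next == 'u' && i + 6 <= term.length() && isHex4(term, i + 2)) {
                    char decoded = (char) Integer.parseInt(term.substring(i + 2, i + 6), 16);
                    // Assumption: a decoded wildcard or backslash was meant literally,
                    // so re-escape it rather than turn it into an operator.
                    if (decoded == '*' || decoded == '?' || decoded == '\\') {
                        out.append('\\');
                    }
                    out.append(decoded);
                    i += 6;
                } else {
                    // Any other escape pair is copied through unchanged.
                    out.append(c).append(next);
                    i += 2;
                }
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    // True if the four characters starting at 'start' are ASCII hex digits.
    private static boolean isHex4(String s, int start) {
        for (int k = start; k < start + 4; k++) {
            char ch = s.charAt(k);
            boolean hex = (ch >= '0' && ch <= '9')
                    || (ch >= 'a' && ch <= 'f')
                    || (ch >= 'A' && ch <= 'F');
            if (!hex) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The term from this report, written as a Java string literal.
        System.out.println(unescapeUnicodeOnly("literal\\:\\u0063olo*n"));
    }
}
```

Consuming escapes in pairs matters here: an escaped backslash followed by a plain "u" and four hex
digits must not be mistaken for a Unicode escape.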

My test case, added at the beginning of TestExtendedDismaxParser.testFocusQueryParser:

{code}
    assertQ("expected doc is missing (using escaped edismax w/field)",
        req("q", "t_special:literal\\:\\u0063olo*n",
            "defType", "edismax"),
        "//doc[1]/str[@name='id'][.='46']");
{code}


