jackrabbit-dev mailing list archives

From "Ard Schrijvers (JIRA)" <j...@apache.org>
Subject [jira] [Created] (JCR-3511) JackrabbitQueryParser incorrectly handles terms with wildcards when using analyzers that do more than lowercasing
Date Tue, 05 Feb 2013 11:37:13 GMT
Ard Schrijvers created JCR-3511:
-----------------------------------

             Summary: JackrabbitQueryParser incorrectly handles terms with wildcards when using analyzers that do more than lowercasing
                 Key: JCR-3511
                 URL: https://issues.apache.org/jira/browse/JCR-3511
             Project: Jackrabbit Content Repository
          Issue Type: Bug
            Reporter: Ard Schrijvers
            Assignee: Ard Schrijvers
             Fix For: 2.2.14, 2.4.4


Wildcard prefixing or postfixing combined with stemming cannot always be made to work
correctly in Lucene. However, postfixing a term with a wildcard should play nicely with the
configured analyzers. Assume you have an analyzer that contains Lucene's ISOLatin1AccentFilter.
In that case, there is currently a problem: after indexing, for example, the word 'très' (mind
the è accent), the query

//*[jcr:contains(., 'trè*')]

does not have a hit for très.

//*[jcr:contains(., 'très')] DOES have a hit, and
//*[jcr:contains(., 'tr*')] DOES have a hit, but
//*[jcr:contains(., 'trè*')] DOES NOT

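The problem is simple to solve: JackrabbitQueryParser#getWildcardQuery receives the non-analyzed
termStr as its argument, where afaics it should receive the analyzed version. With the accent
filter in place the index holds 'tres', so the unanalyzed 'trè*' cannot match it. Once the term
is analyzed, the getLowercaseExpandedTerms() check in #getWildcardQuery also becomes redundant.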
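
To illustrate the idea, here is a minimal, self-contained sketch. It is not the Jackrabbit
patch and uses no Lucene classes: analyze() is a hypothetical stand-in for an analyzer chain
that lowercases and strips accents (roughly what ISOLatin1AccentFilter does for è), and
analyzeWildcardTerm() and wildcardMatches() are names invented for this example. It only shows
why analyzing the literal parts of a wildcard term makes 'trè*' match the indexed 'tres'.

```java
import java.text.Normalizer;
import java.util.Locale;
import java.util.regex.Pattern;

public class WildcardFoldingSketch {

    // Hypothetical stand-in for the configured analyzer: lowercase, then
    // decompose accented characters and drop the combining marks.
    public static String analyze(String text) {
        String lower = text.toLowerCase(Locale.ROOT);
        String decomposed = Normalizer.normalize(lower, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    // Analyze only the literal pieces of a wildcard term, leaving the
    // '*' and '?' wildcard characters untouched.
    public static String analyzeWildcardTerm(String termStr) {
        StringBuilder out = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : termStr.toCharArray()) {
            if (c == '*' || c == '?') {
                out.append(analyze(literal.toString()));
                literal.setLength(0);
                out.append(c);
            } else {
                literal.append(c);
            }
        }
        out.append(analyze(literal.toString()));
        return out.toString();
    }

    // Simplified wildcard matching against a single indexed term:
    // '*' matches any run of characters, '?' matches exactly one.
    public static boolean wildcardMatches(String pattern, String indexedTerm) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.matches(regex.toString(), indexedTerm);
    }

    public static void main(String[] args) {
        // "tr\u00e8s" is 'très'; the index stores the analyzed form "tres".
        String indexed = analyze("tr\u00e8s");

        // Current behaviour: the raw wildcard term is compared with the index.
        System.out.println(wildcardMatches("tr\u00e8*", indexed)); // prints false

        // Proposed behaviour: analyze the term first, then match.
        System.out.println(
                wildcardMatches(analyzeWildcardTerm("tr\u00e8*"), indexed)); // prints true
    }
}
```

Because analyze() already lowercases, a separate lowercasing step on the wildcard term adds
nothing in this sketch, which mirrors the remark about getLowercaseExpandedTerms() above.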



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
