lucene-java-user mailing list archives

From David_Birthw...@VWR.COM
Subject Re: Lowercasing wildcards - why?
Date Fri, 30 May 2003 13:48:42 GMT

Hi Les,

We ended up modifying the QueryParser to pass prefix and suffix queries
through the Analyzer.  For us, it was about stemming.  If you use an
analyzer that incorporates stemming, there are cases where wildcard
queries will not return the expected results.

Example:  "searcher" will probably get stemmed to "search".  A search on
"searche*" should hit the term "searcher", but it won't, because all
instances of "searcher" were stemmed to "search" at index time.  Our
solution was to remove the trailing wildcard, send "searche" to the
analyzer, then tack the wildcard character back on and create the
PrefixQuery object with the new search string "search*".
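
Roughly, the idea looks like the sketch below.  This is not our actual
QueryParser patch and it does not call Lucene at all: analyze() is a
hypothetical toy stand-in for a stemming analyzer, and rewritePrefix()
and prefixMatches() are made-up helper names used only to show the
strip / analyze / re-append steps.

```java
import java.util.Locale;

// Illustrative sketch only. Nothing here is Lucene API; all names are hypothetical.
class WildcardPrefixSketch {

    // Toy stand-in for a stemming analyzer: lowercases the term and strips
    // one of a few common suffixes. A real stemmer is far more involved.
    static String analyze(String term) {
        String t = term.toLowerCase(Locale.ROOT);
        for (String suffix : new String[] {"er", "ing", "e", "s"}) {
            if (t.length() > suffix.length() + 2 && t.endsWith(suffix)) {
                return t.substring(0, t.length() - suffix.length());
            }
        }
        return t;
    }

    // Remove the trailing wildcard, analyze what is left, put the wildcard back.
    static String rewritePrefix(String query) {
        if (!query.endsWith("*")) {
            return analyze(query);
        }
        String prefix = query.substring(0, query.length() - 1);
        return analyze(prefix) + "*";
    }

    // Simplified prefix match against a single indexed term.
    static boolean prefixMatches(String prefixQuery, String indexedTerm) {
        String prefix = prefixQuery.substring(0, prefixQuery.length() - 1);
        return indexedTerm.startsWith(prefix);
    }

    public static void main(String[] args) {
        String indexed = analyze("searcher");          // term as stored at index time
        String raw = "searche*";                       // what the user typed
        String rewritten = rewritePrefix(raw);         // what we search with instead
        System.out.println("indexed term:    " + indexed);
        System.out.println("raw query hit:   " + prefixMatches(raw, indexed));
        System.out.println("rewritten query: " + rewritten);
        System.out.println("rewritten hit:   " + prefixMatches(rewritten, indexed));
    }
}
```

One caveat: a real stemmer will not always reduce a truncated prefix to
the same stem as the full word, so this helps in cases like the one
above but is not a general guarantee.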


From:     Leslie Hughes <Leslie.Hughes@watercorporat>
Date:     05/30/03 01:09 AM
Subject:  Lowercasing wildcards - why?
Please respond to "Lucene Users List"


I was just wondering what the rationale is behind lowercasing wildcard
queries produced by QueryParser? It's just that my data is all upper case
and my analyser doesn't lowercase so it seems a bit odd that I have to call
setLowercaseWildcardTerms(false). Couldn't QueryParser leave the terms
unnormalised or, better still, pass them through the analyser?

I'm sure there's a good reason for it though.....

