lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From David_Birthw...@VWR.COM
Subject Re: Lowercasing wildcards - why?
Date Fri, 30 May 2003 15:24:47 GMT

Your analyzers can optionally incorporate stemming, along with the other
things that analyzers do (lowercasing, etc...).  The stemming algorithms
are all different.  This "searcher" example was made up, but, there are
instances where stemming at index time and not stemming wildcard searches
will result in lost hits.  Specifically, we encountered this situation
using the optional Snoball analyzers (which work great, by the way).

DaveB




                                                                                         
            
                      Leo Galambos                                                       
            
                      <Leo.G@seznam.cz>        To:       Lucene Users List         
                  
                                                <lucene-user@jakarta.apache.org>   
                  
                      05/30/03 10:26 AM        cc:                                       
            
                      Please respond to        Subject:  Re: Lowercasing wildcards - why?
            
                      "Lucene Users                                                      
            
                      List"                                                              
            
                                                                                         
            
                                                                                         
            




I'm sorry, I did not read the complete thread. Do you mean - analyzer ==
stemmer? Does it really work? If I was a stemmer, I would let "searche"
intact. ;-)

-g-

David_Birthwell@VWR.COM wrote:

>Hi Les,
>
>We ended up modifying the QueryParser to pass prefix and suffix queries
>through the Analyzer.  For us, it was about stemming.  If you decide to
use
>an analyzer that incorporated stemming, there are cases where wildcard
>queries will not return the expected results.
>
>Example:  "searcher" will probably get stemmed to "search".  A search on
>"searche*" should hit the term "searcher", but, it won't, all instances of
>"searcher" having been stemmed to "search" at index time.  Our solution
was
>to remove the trailing wildcard and send "searche" to the analyzer, then
>tack the wildcard character back on there and create the PrefixQuery
object
>with the new search string "search*".
>
>DaveB
>
>
>
>
>

>                      Leslie Hughes

>                      <Leslie.Hughes@watercorporat        To:
"'lucene-user@jakarta.apache.org'"
>                      ion.com.au>
<lucene-user@jakarta.apache.org>
>                                                          cc:

>                      05/30/03 01:09 AM                   Subject:
Lowercasing wildcards - why?
>                      Please respond to "Lucene

>                      Users List"

>

>

>
>
>
>
>Hi,
>
>I was just wondering what the rationale is behind lowercasing wildcard
>queries produced by QueryParser? It's just that my data is all upper case
>and my analyser doesn't lowercase so it seems a bit odd that I have to
call
>setLowercaseWildcardTerms(false). Couldn't queryparser leave the terms
>unnormalised or better still pass them through the analyser?
>
>I'm sure there's a good reason for it though.....
>
>
>Les
>
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
>
>
>
>
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
>
>
>




---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org







---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message