lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gerhard Schwarz <gerhard.schw...@fpg.de>
Subject Re: Prefix query case sensitive?
Date Mon, 12 Nov 2001 16:50:04 GMT
Doug Cutting wrote:
> 
> Yes, prefix queries are indeed case sensitive.  The problem is that prefixes
> do not go through an analyzer, which is where lower-casing is done.  The
> reason is that if you were searching for "dogs*" you would not want "dogs"
> first stemmed to "dog", since that would then match "dog*", which is not the
> intended query.  A workaround for this is simply to lowercase the entire
> query before passing it into the query parser.

(Sorry for the late comment, I got a job to do)

Thats no good, at least with my german stemmer. Lowercase only
prefix/wildcard terms.
The problem is that nouns are stemmed differently, and if the whole
query becomes lowercased, the non-prefix/wildcard terms will be
stemmed differently.
I don't want to switch to medium stemming, because soft stemming
with seperated noun stemming and lowercase indexing produces best
overall results for german language (uniqueness of discriminators,
rate of noise and processing speed).


Btw., Doug, could I get access to the CVS to commit changes/
improvements to the german stemming classes?


Greets,
Gerhard

--
To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>


Mime
View raw message