lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Clemens Wyss <clemens...@mysign.ch>
Subject AW: German*Filter, Analyzer "cutting" off letters from (french) words...
Date Mon, 18 Apr 2011 07:08:49 GMT
What is the best way to "avoid" the lowercasing (and still being able to exclude stop words)?

> -----Ursprüngliche Nachricht-----
> Von: Simon Willnauer [mailto:simon.willnauer@googlemail.com]
> Gesendet: Freitag, 15. April 2011 08:56
> An: java-user@lucene.apache.org
> Betreff: Re: German*Filter, Analyzer "cutting" off letters from (french)
> words...
> 
> On Fri, Apr 15, 2011 at 8:48 AM, Clemens Wyss <clemensdev@mysign.ch>
> wrote:
> > Does the StandardAnalyzer lowercase its terms?
> yes!
> 
> simon
> >
> >> -----Ursprüngliche Nachricht-----
> >> Von: Clemens Wyss [mailto:clemensdev@mysign.ch]
> >> Gesendet: Mittwoch, 13. April 2011 13:34
> >> An: java-user@lucene.apache.org
> >> Betreff: AW: German*Filter, Analyzer "cutting" off letters from
> >> (french) words...
> >>
> >> This is what I was looking for, thanks
> >>
> >> > -----Ursprüngliche Nachricht-----
> >> > Von: Robert Muir [mailto:rcmuir@gmail.com]
> >> > Gesendet: Mittwoch, 13. April 2011 12:11
> >> > An: java-user@lucene.apache.org
> >> > Betreff: Re: German*Filter, Analyzer "cutting" off letters from
> >> > (french) words...
> >> >
> >> > If you only want to ignore german stopwords, you don't need to use
> >> > the german analyzer with german stemming. you can just use
> >> > StandardAnalyzer with your own stopwords set!
> >> >
> >> > On Wed, Apr 13, 2011 at 3:51 AM, Clemens Wyss
> >> <clemensdev@mysign.ch>
> >> > wrote:
> >> > > What I really want to do is ignore german stop words such as
> >> > > "der", "die",
> >> > "das", "ein",...
> >> > >
> >> > >> -----Ursprüngliche Nachricht-----
> >> > >> Von: Robert Muir [mailto:rcmuir@gmail.com]
> >> > >> Gesendet: Dienstag, 12. April 2011 17:03
> >> > >> An: java-user@lucene.apache.org
> >> > >> Betreff: Re: German*Filter, Analyzer "cutting" off letters from
> >> > >> (french) words...
> >> > >>
> >> > >> On Tue, Apr 12, 2011 at 8:46 AM, Clemens Wyss
> >> > <clemensdev@mysign.ch>
> >> > >> wrote:
> >> > >> > Why so? Where have the e's gone?
> >> > >> >
> >> > >>
> >> > >> the e is being stemmed as its a german suffix... all of the
> >> > >> german stemming algorithms remove final -e, as do all the french
> >> > >> stemming
> >> > algorithms.
> >> > >>
> >> > >> so i don't understand your problem.
> >> > >>
> >> > >> ----------------------------------------------------------------
> >> > >> ---
> >> > >> -- To unsubscribe, e-mail:
> >> > >> java-user-unsubscribe@lucene.apache.org
> >> > >> For additional commands, e-mail:
> >> > >> java-user-help@lucene.apache.org
> >> > >
> >> > >
> >> >
> >> > -------------------------------------------------------------------
> >> > -- To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org

Mime
View raw message