lucene-java-user mailing list archives

From Simon Willnauer <simon.willna...@googlemail.com>
Subject Re: German*Filter, Analyzer "cutting" off letters from (french) words...
Date Wed, 13 Apr 2011 09:17:23 GMT
On Wed, Apr 13, 2011 at 11:03 AM, Clemens Wyss <clemensdev@mysign.ch> wrote:
> I tried:
> Set<String> stemsToBeIgnored = new HashSet<String>(Arrays.asList( "e" ));
> GermanAnalyzer ga = new GermanAnalyzer( Version.LUCENE_31,
>     GermanAnalyzer.getDefaultStopSet(), stemsToBeIgnored );

Try Arrays.asList("der", "die", "das", "ein") in place of Arrays.asList( "e" ).

Or am I misunderstanding you?
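
To illustrate why a set containing only "e" has no visible effect, here is a toy sketch in plain Java. It is not Lucene code: the class StemExclusionSketch and both of its methods are made up for illustration, and the "strip one trailing e" rule is only a stand-in for what a real German stemmer does. It assumes the exclusion set is compared against complete tokens, which is how I understand the stemExclusionSet handling in 3.1.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy illustration only: this is NOT Lucene's German stemmer.
final class StemExclusionSketch {

    // Hypothetical stand-in for a stemming rule: drop one trailing "e".
    static String stripFinalE(String token) {
        if (token.length() > 1 && token.endsWith("e")) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    // Tokens listed in the exclusion set pass through untouched.
    // The lookup compares the whole token, never a suffix.
    static String stem(String token, Set<String> stemExclusionSet) {
        if (stemExclusionSet.contains(token)) {
            return token;
        }
        return stripFinalE(token);
    }

    public static void main(String[] args) {
        Set<String> suffixOnly = new HashSet<String>(Arrays.asList("e"));
        Set<String> wholeTerms = new HashSet<String>(Arrays.asList("adresse", "die"));

        // "adresse" is not equal to "e", so it is still stemmed.
        System.out.println(stem("adresse", suffixOnly)); // adress
        // "adresse" is in the set as a whole term, so it is left alone.
        System.out.println(stem("adresse", wholeTerms)); // adresse
    }
}
```

Under that assumption, the set has to hold the full terms you want protected, not the suffix you want to keep.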

simon
>
> But the e's are still "removed"...
>
>> -----Original Message-----
>> From: Simon Willnauer [mailto:simon.willnauer@googlemail.com]
>> Sent: Wednesday, 13 April 2011 10:51
>> To: java-user@lucene.apache.org
>> Cc: Clemens Wyss
>> Subject: Re: German*Filter, Analyzer "cutting" off letters from (french)
>> words...
>>
>> On Wed, Apr 13, 2011 at 9:51 AM, Clemens Wyss <clemensdev@mysign.ch>
>> wrote:
>> > What I really want to do is ignore german stop words such as "der", "die",
>> > "das", "ein",...
>>
>> GermanAnalyzer takes a stemExclusionSet. If you put those terms into this
>> set, the stemmer will not touch them. This should be in 3.1, I think:
>>
>> public GermanAnalyzer(Version matchVersion, Set<?> stopwords,
>>                       Set<?> stemExclusionSet)
>>
>> simon
>>
>> >
>> >> -----Original Message-----
>> >> From: Robert Muir [mailto:rcmuir@gmail.com]
>> >> Sent: Tuesday, 12 April 2011 17:03
>> >> To: java-user@lucene.apache.org
>> >> Subject: Re: German*Filter, Analyzer "cutting" off letters from
>> >> (french) words...
>> >>
>> >> On Tue, Apr 12, 2011 at 8:46 AM, Clemens Wyss <clemensdev@mysign.ch>
>> >> wrote:
>> >> > Why so? Where have the e's gone?
>> >> >
>> >>
>> >> The e is being stemmed because it's a German suffix... all of the German
>> >> stemming algorithms remove final -e, as do all the French stemming
>> >> algorithms.
>> >>
>> >> So I don't understand your problem.
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> >> For additional commands, e-mail: java-user-help@lucene.apache.org
>> >
>> >
>
