lucene-java-user mailing list archives

From René Ferréro <rene_ferr...@yahoo.fr>
Subject Re: FW: Full French Analyser ?
Date Fri, 21 Mar 2003 20:46:18 GMT
Sorry, I was interrupted while writing my last message.
I continue below.

--- René Ferréro <rene_ferrero@yahoo.fr> wrote:
> --- Pierre Lacchini <pl@peopleware.lu> wrote:
> > Thx René, that's very helpful !!!
> >
> > But I got an error in the code :
> >
> > String s = stemmer.stem(token.termText());
> >
> > The stem method uses a boolean argument, and not a
> > string...
>
As I indicated, the code in my last email is only an
outline of what to do. Making it work takes a few more
lines of code. There are several ways to do it:

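1- Add a stem(String) method to your filter and have it
delegate to the Snowball frenchStemmer. Then call
stem(token.termText()) in next() instead of
stemmer.stem(token.termText()):

 public String stem(String term)
 {
 	// Hand the term to Snowball, then run the stemmer in place.
 	stemmer.setCurrent(term);
 	if (stemmer.stem())
 		return stemmer.getCurrent();
 	else
 		// Stemming failed: fall back to the original term.
 		return term;
 }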

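2- Add a stem(String s) method to SnowballProgram, the
ancestor of the frenchStemmer class.
This is the fastest way, but it would make your code
dependent on future Snowball releases (you would have to
reapply the change whenever Snowball is updated). Whether
that is acceptable depends on your requirements.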

3- If you have a wide range of different stemmers to
integrate, consider writing an Adapter class that
receives calls from the analyzer and forwards them to a
particular Snowball stemmer.
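
A rough sketch of what option 3 could look like. None of these names come from Snowball or Lucene: SnowballLike, ToyPluralStemmer and StemmerAdapter are hypothetical, and the toy stemmer only strips a trailing "s" so that the example runs without the snowball jar. With the real library you would wrap frenchStemmer and friends behind the same setCurrent / stem / getCurrent calls shown in option 1.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Snowball calling convention used in option 1
// (setCurrent / stem / getCurrent). With the real library this role would be
// played by the generated stemmer classes such as frenchStemmer.
interface SnowballLike {
    void setCurrent(String word);
    boolean stem();
    String getCurrent();
}

// Toy stemmer, for illustration only: strips one trailing "s".
// It is NOT the French Snowball algorithm.
class ToyPluralStemmer implements SnowballLike {
    private String current = "";

    public void setCurrent(String word) { current = word; }

    public boolean stem() {
        if (current.length() > 1 && current.endsWith("s")) {
            current = current.substring(0, current.length() - 1);
            return true;
        }
        return false;
    }

    public String getCurrent() { return current; }
}

// The adapter: the analyzer side only ever calls stem(language, term);
// which stemmer does the work is decided by what was registered.
class StemmerAdapter {
    private final Map<String, SnowballLike> stemmers = new HashMap<>();

    void register(String language, SnowballLike stemmer) {
        stemmers.put(language, stemmer);
    }

    String stem(String language, String term) {
        SnowballLike stemmer = stemmers.get(language);
        if (stemmer == null) {
            return term; // no stemmer registered: leave the term unchanged
        }
        stemmer.setCurrent(term);
        return stemmer.stem() ? stemmer.getCurrent() : term;
    }
}
```

The filter would then hold one StemmerAdapter and call adapter.stem("fr", token.termText()). Since a stemmer of this shape keeps the current word as internal state, I would not share one adapter instance between threads.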


> > 
> > -----Original Message-----
> > From: René Ferréro [mailto:rene_ferrero@yahoo.fr]
> > Sent: Wednesday, 19 March 2003 22:20
> > To: Lucene Users List; pl@peopleware.lu
> > Subject: RE: Full French Analyser ?
> > 
> > 
> > Hi Pierre,
> > 
> > I did the same thing some time ago. Here are the
> > highlights :
> > 
> > 1- Create a FrenchStemFilter class that extends
> > TokenFilter
> > 
> > import net.sf.snowball.ext.frenchStemmer;
> > 
> > /**
> >  * Constructor for SnowballFrenchStemFilter.
> >  */
> > 
> > public FrenchStemFilter(TokenStream in)
> > {
> > 	stemmer = new frenchStemmer();
> > 	input = in;
> > }
> > 
> > public final Token next() throws IOException
> > {
> > 	Token token = input.next();
> > 	if (token == null)
> > 		return null;
> > 	else
> > 	{
> > 		String s = stemmer.stem(token.termText());
> > 		// If not stemmed, don't waste the time creating a new token
> > 		if (!s.equals(token.termText()))
> > 			return new Token(s, token.startOffset(),
> > 				token.endOffset(), token.type());
> > 	}
> > 	return token;
> > }
> > 
> > 2- Finally create a FrenchAnalyzer that returns a
> > TokenStream whose tokens are filtered by the
> > previous stemmer.
> > 
> > Hope that it can help you.
> > 
> > --- Pierre Lacchini <pl@peopleware.lu> wrote:
> > > Ok thx !!! That is exactly what i was looking for...
> > >
> > > But how can i use it ?
> > > (sorry i'm kinda noob in Java)...
> > >
> > > The snowball.JAR has been added to my project,
> > > but now i dunno how to use it...
> > >
> > > -----Original Message-----
> > > From: Alex Murzaku [mailto:lists@lissus.com]
> > > Sent: Wednesday, 19 March 2003 15:49
> > > To: 'Lucene Users List'; pl@peopleware.lu
> > > Subject: RE: Full French Analyser ?
> > >
> > >
> > > You can find Danish, Dutch, English, Finnish,
> > > French, German, Italian, Norwegian, Portuguese,
> > > Russian, Spanish and Swedish Snowball
> > > stemmers/analyzers at:
> > >
> > > http://jakarta.apache.org/lucene/docs/lucene-sandbox/snowball/
> > >
> > > Doug or Otis, why don't you move these out of
> > > the sandbox and make them an integral part of
> > > Lucene?
> > >
> > > --
> > > Alex Murzaku
> > > ___________________________________________
> > >  alex(at)lissus.com  http://www.lissus.com
> > >
> > > -----Original Message-----
> > > From: Pierre Lacchini [mailto:pl@peopleware.lu]
> > > Sent: Wednesday, March 19, 2003 10:02 AM
> > > To: Lucene (E-mail)
> > > Subject: Full French Analyser ?
> > >
> > >
> > > Heya all,
> > >
> > > I'm looking for a full French Analyser, containing
> > > a FrenchPorterStemmer... Does anyone know where i
> > > can find one ?
> > >
> > > And if I wanna create my own FrenchAnalyser - I
> > > have the STOP_WORDS list - can I remove the
> > > standard PorterStemFilter ?
> > >
> > > In fact, can I create a new Analyser without
> > > PorterStemmer at all ?
> > >
> > > Thx ;)
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail:
> > > lucene-user-unsubscribe@jakarta.apache.org
> > > For additional commands, e-mail:
> > > lucene-user-help@jakarta.apache.org
> > ___________________________________________________________
> > Do You Yahoo!? -- A free @yahoo.fr address, in French!
> > Yahoo! Mail : http://fr.mail.yahoo.com
=== message truncated === 

