lucene-dev mailing list archives

From "Neil Couture" <ncout...@convera.com>
Subject RE: language identifier contrib
Date Mon, 13 Jan 2003 22:27:38 GMT
I think the main point is not to lose information, and with a stemmer
you do. If you look at a stemmer as a mathematical function, it is
obvious that it is surjective but not injective, and hence not bijective:
several distinct words map to the same stem, so the original word cannot
be recovered from it.
If you lose information, this will have an impact on your precision and recall.
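
To make the many-to-one point concrete, here is a minimal sketch in Python. The toy_stem function is a hypothetical suffix stripper written only for this illustration; it is not the Porter or Snowball algorithm, and the suffix list and word list are assumptions chosen to show the effect.

```python
from collections import defaultdict

# Hypothetical suffix list, checked in this order (assumption, not Porter/Snowball).
SUFFIXES = ("ization", "ities", "ity", "ing", "es", "al", "e", "s")


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def conflation_classes(words):
    """Group words by stem; any group with more than one word is lost information."""
    classes = defaultdict(set)
    for w in words:
        classes[toy_stem(w)].add(w)
    return dict(classes)


words = ["universe", "universal", "universes", "universities", "university"]
classes = conflation_classes(words)
for stem, members in classes.items():
    print(stem, sorted(members))
```

Once "universe" and "university" share one stem, a query for one also matches documents about the other, which is the kind of precision loss meant here; the inverse mapping from stem back to word no longer exists.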


Needless to say, stemmers can be not so bad on languages like
English, which do not have a very complex morphology. But this is
not the case for, say, French.


-neil

-----Original Message-----
From: Leo Galambos [mailto:galambos@com-os2.ms.mff.cuni.cz]
Sent: 13 January, 2003 17:08
To: Lucene Developers List
Subject: RE: language identifier contrib


You do not like stemmers and so Porter's work is not of very good quality.
Pardon? What are you trying to say?

Do you use text windows in your ``stemmer/lemmatizer'' like Xu & Croft? If
not, where is the improvement for an IR system? I mean, for precision and
recall? Have you written any papers about it? Could you send them to me or
describe the main points here? Thank you very much.

-g-

On Mon, 13 Jan 2003, Neil Couture wrote:

> Because I do not like stemmers. I prefer lemmatizers.
> 
> -neil
> 
> -----Original Message-----
> From: Leo Galambos [mailto:galambos@com-os2.ms.mff.cuni.cz]
> Sent: 13 January, 2003 09:45
> To: Lucene Developers List
> Subject: RE: language identifier contrib
> 
> 
> On Mon, 13 Jan 2003, Neil Couture wrote:
> 
> > The Snowball stemmer is not of very good quality.
> 
> Why do you think so?
> 
> -g-
> 
> 
> --
> To unsubscribe, e-mail:   <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
> For additional commands, e-mail: <mailto:lucene-dev-help@jakarta.apache.org>
> 


