lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Karl Wettin <karl.wet...@gmail.com>
Subject Re: Lemmatization
Date Wed, 08 Jun 2011 21:08:37 GMT
Perhaps "least frequent substring" or even "suffix truncation" might be enough for your needs.

Here is a related paper: http://web.jhu.edu/bin/q/b/p75-mcnamee.pdf


	karl



On Jun 8, 2011, at 1:52 PM, Mohamed Yahya wrote:

> You're right. Still, I am not sure if there is a library that would
> take care of examples such as the one I gave.
> 
> On Wed, Jun 8, 2011 at 11:25, Lahiru Samarakoon <lahiruts@gmail.com> wrote:
>> Hi,
>> 
>>> 
>>> Is there something in Lucene that supports lemmatization of the following
>>> form:
>>> 
>>> Mexican --> Mexico (from adjective to name/noune)
>>> 
>>> Lemmatization do not change part of speech. I think you are looking for a
>> stemming algorithm.
>> 
>> http://nlp.stanford.edu/IR-book/html/htmledition/stemming-and-lemmatization-1.html
>> 
>>> 
>>> 
>>> Thanks,
>> Lahiru
>> 
> 
> 
> 
> -- 
> Mohamed  Yahya
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message