lucene-java-user mailing list archives

From Andrew Green <ndrw_...@yahoo.com.mx>
Subject Re: Straightforward stemming example? Dictionary needed?
Date Thu, 26 Apr 2007 17:41:41 GMT
On Thu, 26-04-2007 at 09:29 +1000, Daniel Noll wrote:
> damien.mccarthy@propylon.com wrote:
> > I guess there are a few points
> > 
> > - it is impossible to stem with total accuracy using rules alone
> > 
> > - combining a rule-based stemmer with a dictionary could also be
> > error-prone. Unrelated words can have the same stem: consider the past
> > tense of "see" and the stem of "sawing" (cutting wood)
> 
> I guess you could solve some of these by stemming (actually, lemmatising 
> is probably more accurate, since you'll almost certainly need part of 
> speech information to do it with any accuracy) to the wordnet number, 
> for which the meaning is unique.
> 
> But there are still ambiguous sentences where you don't know which word 
> it is.  To use the example just given, "I saw wood" is pretty ambiguous, 
> without having more context.
> 
> Daniel
> 
> 
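
To make the "saw" example above concrete, here is a minimal Java sketch.
Everything in it is an assumption made for illustration: the class name
LemmaSketch, the tiny hand-built lexicon, and the part-of-speech labels are
hypothetical, and the suffix rules are a toy, not the Snowball algorithm or
any Lucene API. It only shows why rules alone conflate the two words and why
a lookup keyed on part of speech can separate them once the tag is known.

```java
import java.util.HashMap;
import java.util.Map;

final class LemmaSketch {
    // Hypothetical hand-built lexicon keyed by "surface form/POS tag".
    // This is illustration data only, not taken from WordNet.
    private static final Map<String, String> LEXICON = new HashMap<>();
    static {
        LEXICON.put("saw/VERB_PAST", "see"); // "I saw a bird"
        LEXICON.put("saw/VERB", "saw");      // "I saw wood" (cutting)
        LEXICON.put("saw/NOUN", "saw");      // the tool
        LEXICON.put("sawing/VERB", "saw");
    }

    // Toy rule-based stemmer: strips a few common English suffixes.
    // It knows nothing about meaning, so it cannot map "saw" to "see".
    static String ruleStem(String word) {
        String w = word.toLowerCase();
        if (w.endsWith("ing") && w.length() > 5) {
            return w.substring(0, w.length() - 3);
        }
        if (w.endsWith("ed") && w.length() > 4) {
            return w.substring(0, w.length() - 2);
        }
        if (w.endsWith("s") && !w.endsWith("ss") && w.length() > 3) {
            return w.substring(0, w.length() - 1);
        }
        return w;
    }

    // Dictionary lookup keyed on part of speech, falling back to the rules
    // when the (word, POS) pair is not in the lexicon.
    static String lemma(String word, String pos) {
        String w = word.toLowerCase();
        String hit = LEXICON.get(w + "/" + pos);
        return hit != null ? hit : ruleStem(w);
    }

    public static void main(String[] args) {
        // Rules alone give both words the same stem.
        System.out.println(ruleStem("saw") + " / " + ruleStem("sawing"));
        // With a POS tag, the two senses of "saw" come apart.
        System.out.println(lemma("saw", "VERB_PAST") + " / " + lemma("saw", "VERB"));
    }
}
```

The sketch does not solve the harder problem raised above: for "I saw wood"
something still has to decide whether the tag is VERB_PAST or VERB, and
that needs context the sentence alone does not provide.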

Great... thanks, everyone, for your replies. For now we've decided to
stick to the standard Snowball rule-based stemmer, but we'll be keeping
our eyes open for other options that may come up.

- Andrew Green

