lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Marvin Humphrey <mar...@rectangular.com>
Subject Re: How to get the un-stemed word
Date Fri, 08 Jul 2005 21:07:12 GMT

On Jul 8, 2005, at 8:44 AM, mark harwood wrote:

> You can get the unstemmed word by re-analysing the
> (hopefully stored somewhere) text.
> Look at the tokens emitted from the TokenStream and
> when you get to the one that matches the stemmed form
> you can use the token offset info to retrieve the
> unstemmed form from the original text.

Wouldn't that fall down if you had two distinct terms which produce  
the same string when stemmed?

Best,

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message