lucene-java-user mailing list archives

From Marvin Humphrey
Subject Re: How to get the un-stemed word
Date Fri, 08 Jul 2005 21:07:12 GMT

On Jul 8, 2005, at 8:44 AM, mark harwood wrote:

> You can get the unstemmed word by re-analysing the
> (hopefully stored somewhere) text.
> Look at the tokens emitted from the TokenStream and
> when you get to the one that matches the stemmed form
> you can use the token offset info to retrieve the
> unstemmed form from the original text.

Wouldn't that fall down if you had two distinct terms that produce
the same string when stemmed?
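
To make the concern concrete, here is a minimal sketch of the re-analysis approach in plain Python rather than Lucene. The names `toy_stem`, `analyze`, and `unstemmed_forms` are hypothetical, and the suffix-stripping stemmer is only a stand-in for a real analyzer. It re-tokenizes the stored text, keeps each token's offsets, and slices the original text wherever the stemmed form matches:

```python
import re


def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips a few suffixes."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def analyze(text):
    """Yield (stemmed_term, start_offset, end_offset) for each word."""
    for match in re.finditer(r"[A-Za-z]+", text):
        yield toy_stem(match.group().lower()), match.start(), match.end()


def unstemmed_forms(text, stemmed_term):
    """Use token offsets to recover every surface form behind a stem."""
    return [
        text[start:end]
        for term, start, end in analyze(text)
        if term == stemmed_term
    ]


text = "She walks daily and walked home; walking helps."
forms = unstemmed_forms(text, "walk")
print(forms)
```

With this toy stemmer, several distinct words in the text collapse to the stem "walk", so the lookup returns more than one candidate. Matching on the stemmed form alone cannot say which of them was meant, unless you also know the position of the occurrence you care about.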


Marvin Humphrey
Rectangular Research
