lucene-java-user mailing list archives

From Daniel Naber <lucenel...@danielnaber.de>
Subject Re: Stemming at Query time
Date Tue, 31 May 2005 17:59:11 GMT
On Monday 30 May 2005 18:54, Andrew Boyd wrote:

>   Now that the QueryParser knows about position increments has anyone
> used this to do stemming at query time and not at indexing time?  I
> suppose one would need a reverse stemmer.  Given the query breath it
> would need to inject breathe, breathes, breathing etc.

There are two things to consider. First, queries will get more complicated 
and thus slower. Second, the implementation isn't that easy: while stemming 
can be done with a simple algorithm (for English), adding suffixes requires 
a dictionary with at least part-of-speech information. That's because you 
cannot just add "ing" to any word, otherwise you'd end up with car + ing = 
caring. (But once you have this dictionary, the quality of your solution 
can be better than that of a stemmer, as stemmers also suffer from 
over-stemming, i.e. mapping two unrelated words to the same form.)
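
To make the dictionary idea concrete, here is a minimal sketch in plain Java 
(no Lucene classes involved). Everything in it is an assumption for 
illustration: the class name QueryTimeExpander, the three-word toy 
dictionary, and the naive suffix rules, which ignore irregular verbs and 
consonant doubling. It only shows how part-of-speech information keeps 
"car" from being expanded to "caring", and how the expanded forms could be 
joined into an OR query string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class QueryTimeExpander {
    enum Pos { NOUN, VERB }

    // Hypothetical toy dictionary: base form -> part of speech.
    // A real system would load this from a lexical resource.
    static final Map<String, Pos> DICTIONARY = Map.of(
        "breathe", Pos.VERB,
        "walk", Pos.VERB,
        "car", Pos.NOUN);

    // Returns the term plus the inflected forms its part of speech allows.
    // Unknown terms are returned unexpanded.
    static List<String> expand(String term) {
        List<String> forms = new ArrayList<>();
        forms.add(term);
        Pos pos = DICTIONARY.get(term);
        if (pos == Pos.VERB) {
            // Naive rule: drop a final "e" before "ing" and "ed".
            String stem = term.endsWith("e")
                ? term.substring(0, term.length() - 1)
                : term;
            forms.add(term + "s");
            forms.add(stem + "ing");
            forms.add(stem + "ed");
        } else if (pos == Pos.NOUN) {
            // Nouns only get a plural, never "ing".
            forms.add(term + "s");
        }
        return forms;
    }

    // Joins the expanded forms into a single OR clause for one field.
    static String toQuery(String field, String term) {
        StringBuilder sb = new StringBuilder();
        for (String form : expand(term)) {
            if (sb.length() > 0) {
                sb.append(" OR ");
            }
            sb.append(field).append(':').append(form);
        }
        return "(" + sb + ")";
    }
}
```

With this sketch, QueryTimeExpander.toQuery("body", "breathe") yields a 
four-term OR clause while "car" yields only two terms, which also 
illustrates the first point: every expanded term adds another clause the 
searcher has to evaluate.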

Regards
 Daniel

-- 
http://www.danielnaber.de


