lucene-solr-user mailing list archives

From Ahmet Arslan
Subject Re: Stemmer Question
Date Thu, 08 Mar 2012 16:16:03 GMT
> Thanks, the KeywordMarkerFilterFactory seems to be what I was looking
> for. I'm still wondering about keeping the unstemmed word as a token,
> though. While I know that this would increase the index size slightly,
> I wonder what the downside of doing such a thing would be? It just
> seems less destructive, since I would always store both the unstemmed
> version and the stemmed version. By not storing the unstemmed version,
> there is no way to go back without reindexing. If I wanted to implement
> this, I'm assuming a custom tokenizer would be most appropriate? Does
> something like this already exist?

Not out of the box. I have actually used your idea: I implemented such a custom token filter
by combining the synonym filter and the stem filter. This is useful for wildcard queries, and
for normal queries it could rank exact matches higher.
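
A minimal sketch of the idea, in plain Java with no Lucene dependency: each word is emitted unchanged, and when stemming changes it, the stem is emitted as a second token at the same position (position increment 0), the way a synonym would be. The class name, the Token type, and toyStem are hypothetical and only illustrate the behaviour. This is not the actual filter described above, and toyStem is a crude suffix stripper standing in for a real stemmer. In a real Lucene TokenFilter the same effect would presumably be achieved through PositionIncrementAttribute.

```java
import java.util.ArrayList;
import java.util.List;

public class KeepOriginalStemSketch {

    /** A term plus its position increment (0 = same position as the previous token). */
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    /** Toy suffix stripper used only for illustration; NOT a real stemming algorithm. */
    static String toyStem(String term) {
        if (term.endsWith("ing") && term.length() > 5) {
            return term.substring(0, term.length() - 3);
        }
        if (term.endsWith("s") && term.length() > 3) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }

    /** Emits every original word, followed by its stem at the same position if the stem differs. */
    static List<Token> analyze(String text) {
        List<Token> out = new ArrayList<>();
        for (String word : text.toLowerCase().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // skip the empty piece produced by leading whitespace
            }
            out.add(new Token(word, 1));     // unstemmed token advances the position
            String stem = toyStem(word);
            if (!stem.equals(word)) {
                out.add(new Token(stem, 0)); // stem is stacked on the same position
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (Token t : analyze("walking shoes")) {
            System.out.println(t.term + " (posInc=" + t.positionIncrement + ")");
        }
    }
}
```

With both forms indexed at the same position, a wildcard query such as walki* can match the unstemmed term, while a stemmed query still matches through the stem.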
