lucene-java-user mailing list archives

From "Weir, Michael" <>
Subject Stemming problem question
Date Wed, 23 Feb 2005 20:34:18 GMT
I'm getting complaints that I assume are related to stemming. For
example, "Stamping" (the department) is indexed as "stamp" and is then
not found using 'stamp*' in a query.  Somewhere I read a suggestion that
the text be indexed as two fields, one with the stemmer and one without.

Rather than doing this, does it make sense to implement a
'MultiAnalyzer' class that can be associated with several Analyzers and
returns a 'MultiTokenStream' that reads tokens from each Analyzer in
turn, resetting the Reader between each?

If such a thing makes sense (and hasn't already been implemented) I
would be glad to share it.
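
To make the idea concrete, here is a rough sketch of the shape it could
take. It is an illustration only, not an actual implementation:
SimpleAnalyzer, SimpleTokenStream, PLAIN and TOY_STEMMING are hypothetical
stand-ins invented for this example and are not Lucene classes, and the
"stemmer" just strips a trailing "ing". One assumption worth flagging: a
generic Reader cannot be rewound, so the sketch buffers the text once and
hands each analyzer a fresh StringReader instead of resetting the original.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-ins throughout: none of these types are Lucene classes.
final class MultiAnalyzerSketch {

    /** Simplified token stream: next() returns null when exhausted. */
    interface SimpleTokenStream { String next(); }

    /** Simplified analyzer: turns a Reader into a token stream. */
    interface SimpleAnalyzer { SimpleTokenStream tokenStream(Reader reader); }

    /** Hands the same text to each delegate analyzer in turn. */
    static final class MultiAnalyzer implements SimpleAnalyzer {
        private final SimpleAnalyzer[] delegates;

        MultiAnalyzer(SimpleAnalyzer... delegates) { this.delegates = delegates.clone(); }

        @Override public SimpleTokenStream tokenStream(Reader reader) {
            // Buffer the text once; each delegate gets its own StringReader.
            return new MultiTokenStream(readAll(reader), delegates);
        }
    }

    /** Emits every token of the first delegate, then the second, and so on. */
    static final class MultiTokenStream implements SimpleTokenStream {
        private final String text;
        private final SimpleAnalyzer[] delegates;
        private int index = 0;
        private SimpleTokenStream current = null;

        MultiTokenStream(String text, SimpleAnalyzer[] delegates) {
            this.text = text;
            this.delegates = delegates;
        }

        @Override public String next() {
            while (true) {
                if (current == null) {
                    if (index == delegates.length) return null; // all delegates done
                    current = delegates[index++].tokenStream(new StringReader(text));
                }
                String token = current.next();
                if (token != null) return token;
                current = null; // this delegate is exhausted, move to the next
            }
        }
    }

    /** Lowercases and splits on non-alphanumerics; no stemming. */
    static final SimpleAnalyzer PLAIN = reader -> listStream(words(readAll(reader)));

    /** Like PLAIN, plus a toy "stemmer" that strips a trailing "ing". */
    static final SimpleAnalyzer TOY_STEMMING = reader -> {
        List<String> out = new ArrayList<>();
        for (String w : words(readAll(reader))) {
            out.add(w.length() > 5 && w.endsWith("ing") ? w.substring(0, w.length() - 3) : w);
        }
        return listStream(out);
    };

    static List<String> words(String text) {
        List<String> out = new ArrayList<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!w.isEmpty()) out.add(w);
        }
        return out;
    }

    static SimpleTokenStream listStream(List<String> tokens) {
        Iterator<String> it = tokens.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    static String readAll(Reader reader) {
        try {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) sb.append(buf, 0, n);
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Drains an analyzer's stream into a list, for inspection. */
    static List<String> analyze(SimpleAnalyzer analyzer, String text) {
        List<String> out = new ArrayList<>();
        SimpleTokenStream ts = analyzer.tokenStream(new StringReader(text));
        for (String t = ts.next(); t != null; t = ts.next()) out.add(t);
        return out;
    }

    public static void main(String[] args) {
        SimpleAnalyzer multi = new MultiAnalyzer(TOY_STEMMING, PLAIN);
        // prints [stamp, department, stamping, department]
        System.out.println(analyze(multi, "Stamping department"));
    }
}
```

With both "stamp" and "stamping" in the same field, a prefix query on
either form would have something to match. Note that words the stemmer
leaves alone ("department" here) come out twice, and the sketch ignores
token positions entirely; both would presumably need some thought in a
real analyzer.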

Michael Weir 