lucene-java-user mailing list archives

From "Weir, Michael" <Michael.W...@cognos.com>
Subject Stemming problem question
Date Wed, 23 Feb 2005 20:34:18 GMT
I'm getting complaints that I assume are related to stemming, e.g.
"Stamping" (the department) being indexed as "stamp" and not found using
'stamp*' in a query.  Somewhere I read someone suggesting that text be
indexed as two fields, one with the stemmer and one without.

Rather than doing this, does it make sense to implement a
'MultiAnalyzer' class that can be associated with several Analyzers and
returns a 'MultiTokenStream' that reads tokens from each Analyzer in
turn, resetting the Reader between each?
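
A rough sketch of the idea, for illustration only. It does not use the real
Lucene classes: SimpleAnalyzer, PlainAnalyzer, ToyStemAnalyzer and this
MultiAnalyzer are hypothetical stand-ins, and the "stemmer" just strips a
trailing "ing". It also assumes that buffering the text and giving each
delegate a fresh StringReader is acceptable, since an arbitrary Reader may not
support reset().

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for an analyzer: turns text into a list of terms.
interface SimpleAnalyzer {
    List<String> tokens(Reader reader) throws IOException;
}

// Lower-cases and splits on non-letters; no stemming.
class PlainAnalyzer implements SimpleAnalyzer {
    public List<String> tokens(Reader reader) throws IOException {
        List<String> out = new ArrayList<>();
        String text = MultiAnalyzer.readAll(reader).toLowerCase(Locale.ROOT);
        for (String t : text.split("[^a-z]+")) {
            if (!t.isEmpty()) out.add(t);
        }
        return out;
    }
}

// Toy stemmer (an assumption for the demo): strips a trailing "ing".
class ToyStemAnalyzer implements SimpleAnalyzer {
    private final SimpleAnalyzer base = new PlainAnalyzer();

    public List<String> tokens(Reader reader) throws IOException {
        List<String> out = new ArrayList<>();
        for (String t : base.tokens(reader)) {
            boolean strip = t.endsWith("ing") && t.length() > 5;
            out.add(strip ? t.substring(0, t.length() - 3) : t);
        }
        return out;
    }
}

// Runs each delegate over the same text and concatenates their tokens.
class MultiAnalyzer implements SimpleAnalyzer {
    private final List<SimpleAnalyzer> delegates;

    MultiAnalyzer(SimpleAnalyzer... delegates) {
        this.delegates = Arrays.asList(delegates);
    }

    public List<String> tokens(Reader reader) throws IOException {
        // Buffer once instead of resetting the caller's Reader.
        String text = readAll(reader);
        List<String> out = new ArrayList<>();
        for (SimpleAnalyzer a : delegates) {
            out.addAll(a.tokens(new StringReader(text)));
        }
        return out;
    }

    // Drains a Reader into a String.
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }
}
```

With new MultiAnalyzer(new ToyStemAnalyzer(), new PlainAnalyzer()), both the
stemmed and the unstemmed form of each word end up in the same field. One
thing the sketch ignores is token positions: appending the second analyzer's
tokens after the first's would probably affect phrase queries in a real index.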

If such a thing makes sense (and hasn't already been implemented) I
would be glad to share it.

Thanks,
Michael Weir 
  