lucene-dev mailing list archives

From hg...@cswebmail.com
Subject more rigid stopword list ?
Date Thu, 22 Apr 2004 16:55:09 GMT
Dear all,

for my taste the stopword list included in Lucene (e.g.
StopAnalyzer.ENGLISH_STOP_WORDS, which is usually used
with the SnowballAnalyzer, and I guess also with the
StandardAnalyzer) is not strict enough:

For example, in a sentence with "we need ..." I would
consider "we" and "need" to be stopwords, but they are not
stripped by SnowballAnalyzer or StandardAnalyzer.

Now:
Is there a built-in way to use more restrictive
stripping, or is it better to create my own analyzer
with a more restrictive stopword list?

If so, are you aware of any stricter lists? (A URI
would be great!)
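
To make the idea concrete, here is a minimal sketch in plain Java of what I mean by a more restrictive list. It does not call Lucene at all. The class name StricterStopWords, both word arrays, and the helpers merge and strip are hypothetical names made up for illustration, and the base list is only a stand-in, not a copy of StopAnalyzer.ENGLISH_STOP_WORDS. The assumption is that a merged String[] like this could then be handed to whichever analyzer accepts a custom stop word array.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper class; nothing here is part of Lucene.
class StricterStopWords {

    // Stand-in base list for illustration only (NOT Lucene's actual list).
    static final String[] BASE_STOP_WORDS = {
        "a", "an", "and", "the", "of", "to", "in", "is"
    };

    // Assumed extra words that a stricter list might add.
    static final String[] EXTRA_STOP_WORDS = {
        "we", "need", "i", "you", "would"
    };

    // Merge two lists into one lower-cased array without duplicates,
    // keeping the original order of first appearance.
    static String[] merge(String[] base, String[] extra) {
        Set<String> merged = new LinkedHashSet<>();
        for (String w : base) merged.add(w.toLowerCase(Locale.ROOT));
        for (String w : extra) merged.add(w.toLowerCase(Locale.ROOT));
        return merged.toArray(new String[0]);
    }

    // Lower-case the text, split on non-alphanumeric characters,
    // and drop every token that appears in the stop word list.
    static List<String> strip(String text, String[] stopWords) {
        Set<String> stops = new HashSet<>(Arrays.asList(stopWords));
        List<String> kept = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty() && !stops.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String sentence = "We need a more rigid stopword list";
        String[] strict = merge(BASE_STOP_WORDS, EXTRA_STOP_WORDS);
        // With the base list, "we" and "need" survive.
        System.out.println(strip(sentence, BASE_STOP_WORDS));
        // With the merged list, they are removed.
        System.out.println(strip(sentence, strict));
    }
}
```

With the stand-in base list only "a" is removed from the sample sentence; with the merged list "we" and "need" go as well, which is the behaviour I am after.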

Thanks,

Holger
