lucene-java-user mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: more rigid stopword list ?
Date Thu, 22 Apr 2004 17:46:11 GMT
p.s. there is no need to create a new Analyzer to tweak the stop word  
list.  The analyzers that do stop word removal accept the list as an  
argument to an overloaded constructor.
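
A minimal sketch of the idea, not taken from the original message: build a stricter list by merging a default list with extra words such as "we" and "need", then hand the resulting array to the analyzer. The class name StopWordSketch, its helper methods, and the shortened DEFAULT_STOP_WORDS array are hypothetical stand-ins so the example runs without Lucene on the classpath. The constructor mentioned in the closing comment, StandardAnalyzer(String[] stopWords), is assumed from the Lucene 1.x API of the time; check the Javadoc of your version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class StopWordSketch {

    // Hypothetical, shortened stand-in for a built-in default list such as
    // StopAnalyzer.ENGLISH_STOP_WORDS (the real list is longer).
    static final String[] DEFAULT_STOP_WORDS = {"a", "an", "and", "the", "to", "of"};

    // Merge the default list with extra words, dropping duplicates and keeping order.
    static String[] extend(String[] defaults, String... extras) {
        Set<String> merged = new LinkedHashSet<>(Arrays.asList(defaults));
        for (String word : extras) {
            merged.add(word.toLowerCase(Locale.ROOT));
        }
        return merged.toArray(new String[0]);
    }

    // Naive lowercasing and whitespace tokenization followed by stop word removal.
    // This only shows the effect of the list; a real analyzer does far more.
    static List<String> strip(String text, String[] stopWords) {
        Set<String> stops = new LinkedHashSet<>(Arrays.asList(stopWords));
        List<String> kept = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty() && !stops.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String[] strict = extend(DEFAULT_STOP_WORDS, "we", "need");
        System.out.println(strip("we need a more rigid list", strict));
        // prints [more, rigid, list]

        // With Lucene available, the same array would be passed to the analyzer
        // (assumed Lucene 1.x signature):
        //   Analyzer analyzer = new StandardAnalyzer(strict);
    }
}
```

Keeping the extra words in a separate array from the defaults makes it easy to see which terms were added on top of the stock list.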

	Erik


On Apr 22, 2004, at 1:08 PM, Otis Gospodnetic wrote:

> Moving to lucene-user list.
>
> One of my Lucene articles includes a more comprehensive stop word list
> for English:
>
> http://www.onjava.com/pub/a/onjava/2003/01/15/lucene.html?page=2#references
>
> Otis
>
> --- hgadm@cswebmail.com wrote:
>> Dear all,
>>
>> for my taste the stopword list included in Lucene (e.g.
>> StopAnalyzer.ENGLISH_STOP_WORDS, which is usually used
>> with the SnowballAnalyzer - and I guess also with the
>> StandardAnalyzer) is not strict enough:
>>
>> For example in a sentence with "we need ..." I would
>> consider "we" and "need" as stopwords but they are not
>> stripped by SnowballAnalyzer or StandardAnalyzer.
>>
>> Now:
>> Is there a built-in way to apply more restrictive
>> stripping, or should I create my own analyzer with a
>> more restrictive stopword list?
>>
>> If so, are you aware of more rigid lists? (A URI
>> would be great!)
>>
>> Thanks,
>>
>> Holger
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
>> For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org

