lucene-java-user mailing list archives

From Leo Galambos <>
Subject Re: Where to get stopword lists?
Date Fri, 06 Jun 2003 17:11:20 GMT
Ulrich Mayring wrote:

> Hello,
> does anyone know of good stopword lists for use with Lucene? I'm 
> interested in English and German lists.

What does ``good'' mean? It depends on your corpus, IMHO. The best way 
to get a ``good'' stop-list is an analysis based on idf: index your 
documents, list out all the terms with low idf (i.e. the terms that occur 
in most of your documents), save them to a file, and use that file as the 
stop-list in the next indexing round.
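
A rough sketch of that analysis, in plain Python rather than against the Lucene API. The function name low_idf_terms, the whitespace tokenizer, the idf formula log(N/df), and the max_idf cutoff are all assumptions made for illustration; a real run would read document frequencies from your index and pick the cutoff by looking at the corpus.

```python
import math
from collections import Counter

def low_idf_terms(documents, max_idf):
    """Return terms whose idf is at or below max_idf, lowest idf first.

    Assumption: idf = log(N / df), with naive lowercase whitespace tokens.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for text in documents:
        # Count each term once per document (document frequency).
        doc_freq.update(set(text.lower().split()))
    idf = {term: math.log(n_docs / df) for term, df in doc_freq.items()}
    # Sort by idf, then alphabetically so ties come out in a stable order.
    return sorted((t for t, v in idf.items() if v <= max_idf),
                  key=lambda t: (idf[t], t))

# Hypothetical toy corpus; substitute your own documents.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "a bird sang on the roof",
    "the index stores the terms",
]
stopwords = low_idf_terms(docs, max_idf=0.7)
```

On a corpus this small, ordinary content words can fall under the cutoff as well, which is the point about the list depending on your corpus. Writing the result one term per line gives you the file for the next indexing round.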

Just a thought...
