lucene-java-user mailing list archives

From Twan Kogels <>
Subject Dutch Analyzer dictionary format?
Date Fri, 26 Nov 2004 09:42:04 GMT
Hello all,

I'm using Lucene to search through a set of documents to find the 
interesting ones. Most of the documents are written in Dutch. I saw that the 
default snowball stemmer wasn't doing well on text written in a foreign 
language. Luckily I found a Dutch text analyzer in the Lucene sandbox project.

I've read the javadoc and found out that it needs a stem dictionary. You can 
load this dictionary with the following method:
DutchAnalyzer.setStemDictionary(File f)

The format needs to be a tab-separated list, one entry per line (word [tab] stem).
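
To illustrate the format, here is a minimal sketch in plain Java that writes such a file and reads it back. It does not use Lucene itself. The class name StemDictionaryWriter, the UTF-8 encoding and the two example word/stem pairs are assumptions made for illustration only, not something taken from the analyzer's documentation.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene.
public class StemDictionaryWriter {

    // Writes one "word<TAB>stem" pair per line.
    // UTF-8 is an assumption; the analyzer may expect another encoding.
    public static void write(Map<String, String> stems, Path target) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (Map.Entry<String, String> entry : stems.entrySet()) {
                out.write(entry.getKey());
                out.write('\t');
                out.write(entry.getValue());
                out.newLine();
            }
        }
    }

    // Reads the file back, skipping lines that are not exactly "word<TAB>stem".
    public static Map<String, String> read(Path source) throws IOException {
        Map<String, String> stems = new LinkedHashMap<>();
        for (String line : Files.readAllLines(source, StandardCharsets.UTF_8)) {
            String[] parts = line.split("\t");
            if (parts.length == 2) {
                stems.put(parts[0], parts[1]);
            }
        }
        return stems;
    }

    public static void main(String[] args) throws IOException {
        // Made-up example entries, only to show the layout of the file.
        Map<String, String> stems = new LinkedHashMap<>();
        stems.put("fietsen", "fiets");
        stems.put("huizen", "huis");

        Path file = Files.createTempFile("stems", ".txt");
        write(stems, file);
        System.out.println(read(file));

        // The resulting file would then be handed to the analyzer, e.g.
        // analyzer.setStemDictionary(file.toFile());
    }
}
```

Reading the file back before passing it to setStemDictionary() is a cheap way to check that every line really contains exactly one tab.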

To be sure I do everything correctly, I have a question about the dictionary:
Can I just get:
and convert it to a tab-separated list and then "feed" it to the 
setStemDictionary() method?

Kind regards,
Twan Kogels 
