lucene-java-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Dutch Analyzer dictionary format?
Date Fri, 26 Nov 2004 17:02:13 GMT
Judging from everything you've said, the answer is yes.  I don't use
Dutch Analyzer, so I'm not 100% sure about this, but it sounds easy
enough to try.

Otis

--- Twan Kogels <twan@twansoft.com> wrote:

> Hello all,
> 
> I'm using Lucene to search through a couple of documents to find
> interesting ones. Most documents are in Dutch. I saw that the
> default Snowball stemmer wasn't doing well on text written in a
> foreign language. Luckily, I found a Dutch text analyzer in the
> Lucene sandbox project.
> 
> I've read the javadoc and found out that it needs a stem dictionary.
> You can load this dictionary with the following function:
> DutchAnalyzer.setStemDictionary(File f)
> 
> The format needs to be a tab-separated list (word [tab] stem).
> 
> To be sure I do everything correctly, I have a question about the
> dictionary. Can I just get:
> <http://snowball.tartarus.org/dutch/diffs.txt>
> convert it to a tab-separated list, and then "feed" it to the
> setStemDictionary() function?
> 
> Kind regards,
> Twan Kogels 
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 
> 
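The conversion asked about above could be sketched roughly as follows.
This is only a sketch under assumptions, not a tested recipe for
DutchAnalyzer: it assumes each useful line of diffs.txt holds exactly two
whitespace-separated columns (word, then stem), and it skips any line
that does not fit that shape. The class name StemDictionaryConverter and
its convert() method are hypothetical names made up for this example;
only setStemDictionary(File) and the "word [tab] stem" format come from
the message above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of Lucene.
class StemDictionaryConverter {

    // Rewrites "word <whitespace> stem" lines as "word<TAB>stem" lines.
    // Returns the number of entries written.
    static int convert(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        int written = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore blank lines
            }
            String[] parts = trimmed.split("\\s+");
            if (parts.length != 2) {
                continue; // assumption: anything but two columns is not an entry
            }
            out.write(parts[0]);
            out.write('\t');
            out.write(parts[1]);
            out.write('\n');
            written++;
        }
        out.flush();
        return written;
    }

    // Usage: java StemDictionaryConverter diffs.txt stemdict.txt
    public static void main(String[] args) throws IOException {
        // ISO-8859-1 maps every byte to one char and back, so the bytes of
        // each word pass through unchanged whatever the file's real encoding.
        try (Reader in = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
             Writer out = Files.newBufferedWriter(Paths.get(args[1]), StandardCharsets.ISO_8859_1)) {
            int n = convert(in, out);
            System.out.println("Wrote " + n + " entries");
        }
    }
}
```

The resulting file could then be passed as the File argument to
setStemDictionary(). Whether the analyzer expects a particular character
encoding for that file is not stated in this thread, so it is worth
checking a few words with accented characters after loading.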



