lucene-java-user mailing list archives

From Robert Muir
Subject Re: Where to find non-English dictionaries, thesaurus, synonyms
Date Fri, 07 Jan 2011 21:26:37 GMT
On Thu, Jan 6, 2011 at 11:53 AM, Pulkit Singhal wrote:
> Hello,
> What's a good source to get dictionaries (for spelling corrections) and/or
> thesauri (for synonyms) that can be used with Lucene for non-English
> languages such as French, Chinese, Korean etc?

If you can't find a wordlist of correctly spelled words somewhere
else, you can always grab the OpenOffice spellchecker dictionary for
that language and use the hunspell "unmunch" command (sort of like
morphological generation) to generate a list of words that you could
then use with PlainTextDictionary.
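
As a rough illustration of the step between unmunch and
PlainTextDictionary: PlainTextDictionary reads one word per line, and the
expanded list may contain blank lines, stray whitespace, or repeated
forms. The sketch below tidies such a list using only the Java standard
library. The class name WordlistCleaner and the file names are
hypothetical, and it assumes the unmunch output was saved as UTF-8 text.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Lucene or hunspell.
final class WordlistCleaner {

    /** Trims each line, drops empty lines, and removes duplicates, keeping first-seen order. */
    static List<String> clean(List<String> rawLines) {
        Set<String> unique = new LinkedHashSet<>();
        for (String line : rawLines) {
            String word = line.trim();
            if (!word.isEmpty()) {
                unique.add(word);
            }
        }
        return new ArrayList<>(unique);
    }

    // Usage (file names are placeholders): java WordlistCleaner unmunched.txt words.txt
    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: WordlistCleaner <unmunch-output> <wordlist-out>");
            return;
        }
        // Assumption: the input file is UTF-8 encoded.
        List<String> raw = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        Files.write(Paths.get(args[1]), clean(raw), StandardCharsets.UTF_8);
    }
}
```

The resulting file is a plain one-word-per-line list, which is the shape
PlainTextDictionary expects as input.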

> For example, the wordnet contrib module is based on the data set
> provided by the Princeton-based wordnet system, but I'm wondering where
> Lucene users go for a similarly reliable source for other languages?

In this case I would also investigate the OpenOffice thesaurus data,
if you can't find anything else.
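
A minimal sketch of reading such thesaurus data into a synonym map,
assuming the files use the MyThes-style plain-text layout: a first line
naming the encoding, then for each entry a "word|N" header followed by N
lines of the form "(pos)|synonym1|synonym2|...". That layout is an
assumption to check against the actual file for your language, and the
class name ThesaurusParser is hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene or OpenOffice.
final class ThesaurusParser {

    /** Maps each headword to its synonyms, merged across all of its meaning lines. */
    static Map<String, List<String>> parse(List<String> lines) {
        Map<String, List<String>> synonyms = new LinkedHashMap<>();
        int i = 1; // line 0 is assumed to hold the encoding name
        while (i < lines.size()) {
            String[] header = lines.get(i++).split("\\|");
            if (header.length < 2) {
                continue; // blank or malformed header line
            }
            int meanings;
            try {
                meanings = Integer.parseInt(header[1].trim());
            } catch (NumberFormatException e) {
                continue; // not a "word|N" header
            }
            List<String> list = synonyms.computeIfAbsent(header[0].trim(), k -> new ArrayList<>());
            for (int m = 0; m < meanings && i < lines.size(); m++) {
                String[] fields = lines.get(i++).split("\\|");
                // fields[0] is the part-of-speech tag, e.g. "(adj)"; the rest are synonyms
                for (int f = 1; f < fields.length; f++) {
                    String synonym = fields[f].trim();
                    if (!synonym.isEmpty() && !list.contains(synonym)) {
                        list.add(synonym);
                    }
                }
            }
        }
        return synonyms;
    }
}
```

The resulting map could then be fed into whatever synonym expansion you
already use at index or query time.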
