lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject German decompounding/tokenization with Lucene?
Date Fri, 15 Sep 2017 22:57:51 GMT
Hello,

I need to index documents with German text in Lucene, and I'm wondering how
people have done this in the past?

Lucene already has a DictionaryCompoundWordTokenFilter ... is this what
people use?  Are there good, open-source friendly German dictionaries
available?
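For context on what that filter does: DictionaryCompoundWordTokenFilter takes a brute-force approach, checking every substring of the token (within configured subword length bounds) against a supplied dictionary and emitting the matches as extra tokens. The sketch below imitates that strategy in plain Java, outside of an analysis chain, with a tiny hypothetical dictionary; a real setup would load a full open-source German word list and wire the filter into an Analyzer instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class Decompounder {
    private final Set<String> dict;
    private final int minSubword;
    private final int maxSubword;

    public Decompounder(Set<String> dict, int minSubword, int maxSubword) {
        this.dict = dict;
        this.minSubword = minSubword;
        this.maxSubword = maxSubword;
    }

    // Mirrors the dictionary filter's brute-force strategy: at every start
    // offset, test all substrings between minSubword and maxSubword
    // characters long against the dictionary, collecting every hit.
    public List<String> decompound(String token) {
        List<String> parts = new ArrayList<>();
        String lower = token.toLowerCase(Locale.GERMAN);
        for (int i = 0; i < lower.length(); i++) {
            for (int len = minSubword;
                 len <= maxSubword && i + len <= lower.length(); len++) {
                String candidate = lower.substring(i, i + len);
                if (dict.contains(candidate)) {
                    parts.add(candidate);
                }
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        // Hypothetical mini-dictionary for illustration only.
        Set<String> dict = new HashSet<>(
                Arrays.asList("donau", "dampf", "schiff", "fahrt"));
        Decompounder d = new Decompounder(dict, 4, 15);
        System.out.println(d.decompound("Donaudampfschifffahrt"));
        // prints [donau, dampf, schiff, fahrt]
    }
}
```

Note the quality of the dictionary dominates the result with this approach: a dictionary containing short, common substrings will over-generate subwords, which is why the filter also exposes minimum/maximum subword lengths and an onlyLongestMatch option.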

Thanks,

Mike McCandless

http://blog.mikemccandless.com
