lucene-java-user mailing list archives

From "Erick Erickson"
Subject Re: stemmer
Date Sat, 18 Nov 2006 15:49:27 GMT

There are some rather extensive threads on this list about the "interesting"
issues that exist when indexing/searching other languages. I think you'd
find it worthwhile to search the list archive for foreign language or some

The short answer as I remember is that there *is* a built-in stemmer, but
whether it does what you want when indexing multiple languages depends upon
what results you expect to get...and there's no clear answer that I
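
To make the multi-language concern concrete, here is a minimal sketch in plain Java, not Lucene code. It assumes each document and each query is tagged with a language code, and it picks a stemmer for that language. The class name, the normalize method and the suffix lists are all hypothetical, and the suffix rules are toy stand-ins for real stemmers. In Lucene itself the equivalent choice would be which Analyzer to use per language (for example PorterStemFilter for English, or the language-specific analyzers in the contrib packages).

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

public class PerLanguageStemming {

    // Toy suffix stripper: removes the first matching suffix, provided at
    // least three characters of the term remain. NOT a real stemming algorithm.
    static String stripSuffixes(String term, String... suffixes) {
        for (String suffix : suffixes) {
            if (term.length() > suffix.length() + 2 && term.endsWith(suffix)) {
                return term.substring(0, term.length() - suffix.length());
            }
        }
        return term;
    }

    // Hypothetical per-language rules (illustrative only, longest suffix first).
    static final Map<String, UnaryOperator<String>> STEMMERS = new HashMap<>();
    static {
        STEMMERS.put("en", t -> stripSuffixes(t, "ing", "es", "s"));
        STEMMERS.put("fr", t -> stripSuffixes(t, "ements", "ement", "es", "s"));
        STEMMERS.put("de", t -> stripSuffixes(t, "ungen", "ung", "en", "e"));
    }

    // Lowercase the term, then apply the stemmer registered for the language.
    // Unknown languages fall back to lowercasing only.
    static String normalize(String language, String term) {
        String lower = term.toLowerCase(Locale.ROOT);
        UnaryOperator<String> stemmer = STEMMERS.get(language);
        return stemmer == null ? lower : stemmer.apply(lower);
    }

    public static void main(String[] args) {
        // With the German rules, both forms reduce to the same term.
        System.out.println(normalize("de", "Zeitungen") + " / " + normalize("de", "Zeitung"));
        // With the English rules, the same two words stay distinct.
        System.out.println(normalize("en", "Zeitungen") + " / " + normalize("en", "Zeitung"));
    }
}
```

Whatever stemmer is chosen, the same one has to be applied to a given language's text at index time and at query time. Otherwise the stemmed terms in the index and the terms in the query will not match.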


On 11/18/06, Thomas Klein wrote:
> Hi there,
> I'm fairly new to Lucene. I just developed a multi-threaded indexing
> TCP server using Lucene to, hmmm, let me remember, index stuff :)
> I have to index not only English, but also French and German, and, I don't
> know, perhaps other languages in the future.
> Does Lucene use a default stemmer, or do I have to stem the texts before
> indexing?
> Does a multi-language stemmer exist?
> (Sorry if the answers are in the documentation, I didn't manage to
> fully read it.)
> Thanks in advance!
> Thomas Klein.