lucene-java-user mailing list archives

From Cesar Ronchese
Subject Re: Indexing accented characters, then searching by any form
Date Mon, 11 Feb 2008 16:11:41 GMT

Hey Karl. Thanks for the response. I have a few more questions:

1) About the ISOLatin1AccentFilter class:
> What is the problem you have with this? Are they not unique enough?

I need to store the words the way they were written. So, if the text to be
indexed contains the word "usuário", my user expects to see the word in its
correct form in the search results: "usuário", not "usuario".

I could be wrong, but this is what I think ISOLatin1AccentFilter does: it
removes the accents while indexing the content, so every later result will
come back without accents as well. In short, I lose the accents.
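
To make the requirement concrete, here is a minimal sketch of the behavior I
am after, in plain Java with no Lucene classes. It assumes accent folding via
java.text.Normalizer as a stand-in for what I understand the filter to do.
The class AccentFoldSketch and its methods fold, add and search are
hypothetical names made up for this illustration. The folded form is used
only as the lookup key, and the original text is what gets returned.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustration only: a toy in-memory index, not Lucene and not ISOLatin1AccentFilter.
public class AccentFoldSketch {

    // Hypothetical helper: decompose to NFD, drop combining marks, lowercase.
    public static String fold(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "").toLowerCase(Locale.ROOT);
    }

    // Folded key -> original texts exactly as they were written.
    private final Map<String, List<String>> index = new LinkedHashMap<>();

    // Index under the folded form, but keep the original for display.
    public void add(String original) {
        index.computeIfAbsent(fold(original), k -> new ArrayList<>()).add(original);
    }

    // Fold the query the same way, then return the original texts.
    public List<String> search(String query) {
        return index.getOrDefault(fold(query), new ArrayList<>());
    }

    public static void main(String[] args) {
        AccentFoldSketch idx = new AccentFoldSketch();
        idx.add("usu\u00e1rio"); // "usuário"
        // Both queries should find the entry and show it with its accent.
        System.out.println(idx.search("usuario"));
        System.out.println(idx.search("USU\u00c1RIO"));
    }
}
```

So the search matches any form of the word, but what the user sees is the
text as it was originally written.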

2) About the synonym and permutation:
> You would have to add a synonym for each permutation of the accented term.

Do you have a code sample for this? Or even a link to a how-to?
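
In case it helps to show what I understood, here is my rough guess at what
"each permutation" means, again in plain Java with no Lucene classes. I am
assuming that every accented character is either kept or replaced by its
unaccented base letter. The class AccentVariantsSketch and its methods
foldChar and variants are hypothetical names for this illustration, and the
sketch may not be what you had in mind.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;

// Illustration only: enumerates accented/unaccented spellings of one term.
public class AccentVariantsSketch {

    // Hypothetical helper: strips the accent from one character, e.g. 'á' -> "a".
    public static String foldChar(char c) {
        return Normalizer.normalize(String.valueOf(c), Normalizer.Form.NFD)
                .replaceAll("\\p{M}+", "");
    }

    // Builds every spelling in which each accented character is kept or folded.
    // A term with k accented characters yields 2^k variants.
    public static List<String> variants(String term) {
        List<String> results = new ArrayList<>();
        results.add("");
        for (char c : term.toCharArray()) {
            String original = String.valueOf(c);
            String plain = foldChar(c);
            List<String> next = new ArrayList<>();
            for (String prefix : results) {
                next.add(prefix + original);
                if (!plain.equals(original)) {
                    next.add(prefix + plain); // branch only on accented characters
                }
            }
            results = next;
        }
        return results;
    }

    public static void main(String[] args) {
        // "ação" has two accented characters, so four spellings are generated.
        for (String v : variants("a\u00e7\u00e3o")) {
            System.out.println(v);
        }
    }
}
```

Each generated spelling would then be a candidate synonym for the original
term. The list doubles with every accented character, which is part of why I
am asking whether there is a better way.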