lucene-java-user mailing list archives

From John Wang
Subject i18n query normalization
Date Tue, 23 Aug 2005 17:15:17 GMT

    We have a multilingual index, and we need accented characters to
match their unaccented counterparts. For example, if a document
contains "mângão", the query "mangao" should match it.

    I guess I would have to build some sort of analyzer/tokenizer for this.

    Are there tokenizers already built for Lucene that handle this?
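
    To make the requirement concrete, the folding described above can be sketched in plain Java, independent of Lucene. This is only an illustration under stated assumptions, not a Lucene analyzer: the class name AccentFolder and the method fold are hypothetical, and it assumes java.text.Normalizer (Java 6 or later) is available. The same folding would presumably have to be applied to both the indexed text and the query.

```java
import java.text.Normalizer;

// Hypothetical helper (not part of Lucene): strips accents from a string.
class AccentFolder {

    // Decompose to NFD so each accented letter becomes a base letter
    // followed by combining marks, then drop the combining marks.
    static String fold(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // "m\u00e2ng\u00e3o" is "mângão" written with Unicode escapes
        // so the source file encoding does not matter.
        String indexedTerm = fold("m\u00e2ng\u00e3o");
        String queryTerm = fold("mangao");
        System.out.println(indexedTerm + " matches " + queryTerm + ": "
                + indexedTerm.equals(queryTerm));
    }
}
```

    Characters that have no canonical decomposition (for example "ø" or "ß") are left unchanged by this approach, so a complete solution would likely need an explicit mapping table for them.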
