lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mariella Di Giacomo <marie...@lanl.gov>
Subject Question about Analyzer and words spelled in different languages
Date Wed, 05 Jan 2005 18:51:58 GMT
Hi ALL,


We are trying to index scientic articles written in english, but whose 
authors can be spelled in any language (depending on the author's nazionality)

E.g.
Schäffer


In the XML document that we provide to Lucene the author name is written in 
the following way (using HTML ENTITIES)

Sch&amp;auml;ffer

So in practice that is the name that would be given to a Lucene analyzer/filter

Is there any already written analyzer that would take that name 
(Sch&amp;auml;ffer or any other name that has entities) so that
Lucene index could searched (once the field has been indexed) for the real 
version of the name, which is

Schäffer

and the english spelled version of the name which is

Schaffer

Thanks a lot in advance for your help,


Mariella

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message