lucene-solr-user mailing list archives

From Danilo Giacomi <danilogiac...@gmail.com>
Subject allograph matching
Date Mon, 30 Jan 2017 11:35:23 GMT
Hi all,
we have some Latin documents indexed in Solr that we'd like to search
with allographs taken into account as well.

That is, because the same letters have been written in different ways
over time, the same word can now appear spelled with
different letters.

Specifically, our texts use the letters "u" and "v" interchangeably
(we can find "tempus" as well as "tempvs" in our
documents), and likewise "y", "j" and "i" (so we can have
"fugit" or "fugjt" or "fvgit", etc.).

As we cannot provide a full list of synonyms (it would amount to the
entire Latin dictionary...), is there a sensible way to achieve what we're
after?
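
To make the desired behaviour concrete, here is a minimal Python sketch of
the equivalence we have in mind. The function name `fold_allographs` and the
mapping table are only our own illustration, not anything Solr provides; the
choice of "u" and "i" as the canonical letters is an assumption, and
presumably the same folding would have to be applied to both the indexed
text and the query.

```python
# Hypothetical illustration only: fold each allograph to one canonical letter.
# Assumed canonical forms: "v" -> "u", and "j" / "y" -> "i".
ALLOGRAPHS = str.maketrans({"v": "u", "j": "i", "y": "i"})


def fold_allographs(word: str) -> str:
    """Lowercase the word and map every allograph to its canonical letter."""
    return word.lower().translate(ALLOGRAPHS)


def same_word(a: str, b: str) -> bool:
    """Two spellings should match when their folded forms are identical."""
    return fold_allographs(a) == fold_allographs(b)


if __name__ == "__main__":
    for spelling in ["tempus", "tempvs", "fugit", "fugjt", "fvgit"]:
        print(spelling, "->", fold_allographs(spelling))
```

With this folding, a search for "tempus" would also be expected to find
"tempvs", and "fugit" would find "fugjt" and "fvgit".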

Thanks in advance,
Danilo
