lucene-dev mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: [jira] Updated: (LUCENE-1029) Illegal character replacements in ISOLatin1AccentFilter
Date Tue, 16 Oct 2007 17:13:32 GMT
I feel like a fool continuing this debate, being the least intelligent 
guy in the room, but here goes:

My point was that Wikipedia (the link I gave, and the other definitions I 
saw, possibly including Webster's dictionary) seems to refer to the little 
markings around a letter as diacriticals whether or not they make the 
letter a completely different letter (see the part mentioning 
Scandinavian). Marko disputed this in his last comment, and I don't know 
that he is wrong, but everything I have seen so far seems to indicate it.

I also dispute this sentence in the proposed javadoc patch:

*It will also be impossible to search for the word in its original form.*

If you use the same analyzer at index time and at query time, there should be no such problem: the query term is stripped the same way the indexed term was, so the two still match.
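
A minimal sketch of this argument, not Lucene code: the fold() method below is a hypothetical stand-in for an accent-stripping filter, built on java.text.Normalizer, and the in-memory map is an assumed toy index. It only illustrates that a term in its original form still matches when the query goes through the same folding as the indexed text, and stops matching when the query is left unanalyzed.

```java
import java.text.Normalizer;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

class AccentFoldingDemo {

    // Hypothetical stand-in for an accent-stripping analyzer:
    // decompose to NFD, drop combining marks, then lowercase.
    static String fold(String term) {
        String decomposed = Normalizer.normalize(term, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "").toLowerCase(Locale.ROOT);
    }

    // Toy inverted index: folded term -> ids of documents containing it.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Index time: every whitespace-separated token is folded before storage.
    void index(int docId, String text) {
        for (String token : text.split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            postings.computeIfAbsent(fold(token), k -> new HashSet<>()).add(docId);
        }
    }

    // Query time: foldQuery = true applies the same folding as index time;
    // foldQuery = false looks up the raw term, as an unanalyzed query would.
    Set<Integer> search(String queryTerm, boolean foldQuery) {
        String key = foldQuery ? fold(queryTerm) : queryTerm;
        return postings.getOrDefault(key, Collections.<Integer>emptySet());
    }

    public static void main(String[] args) {
        AccentFoldingDemo idx = new AccentFoldingDemo();
        String original = "sm\u00f6rg\u00e5sbord"; // word with o-diaeresis and a-ring

        idx.index(1, original);

        // Same folding on both sides: original and stripped forms both match.
        System.out.println(idx.search(original, true));
        System.out.println(idx.search("smorgasbord", true));

        // Raw query against a folded index: the original form no longer matches.
        System.out.println(idx.search(original, false));
    }
}
```

In this sketch the first two searches find document 1 and the third finds nothing, which is the case the javadoc sentence seems to describe: the original form is only unsearchable when the query side skips the filter.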


Doug Cutting wrote:
> Mark Miller wrote:
>> I wouldn't pretend to know the truth on this matter, but you might 
>> update the wikipedia article http://en.wikipedia.org/wiki/Diacritic 
>> if you do, as it does not agree with your comments.
>
> Wikipedia says, "Swedish uses characters identical to a-diaeresis (ä) 
> and o-diaeresis (ö)".  This is a little ambiguous.  Identical how?  I 
> think they mean "visually identical to".  The distinction is whether 
> Swedish treats 'ä' as a variant of 'a' or as a completely separate 
> letter.  The latter is the case.
>
> http://en.wikipedia.org/wiki/Umlaut_(diacritic) states:
>
>   Swedish [...] treat[s] them as independent letters.
>
> Doug
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>
