lucene-java-user mailing list archives

From Pierrick Brihaye <pierrick.brih...@culture.gouv.fr>
Subject Re: (Offtopic) The unicode name for a character
Date Wed, 22 Dec 2004 11:31:56 GMT
Hi,

Morus Walter wrote:

> If you cannot find that list somewhere I can mail you a copy.

ICU4J's copy is here:

http://oss.software.ibm.com/cvs/icu4j/icu4j/src/com/ibm/icu/dev/data/unicode/UnicodeData.txt?rev=1.7&content-type=text/x-cvsweb-markup

See also Unicode's own copy:
http://www.unicode.org/Public/UNIDATA/UnicodeData.txt

http://pistos.pe.kr/javadocs/etc/icu4j2_4/doc/com/ibm/icu/lang/UCharacter.html#getName(int)

should also help you.
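As a rough illustration of that kind of lookup, here is a minimal sketch. It assumes a modern JDK and uses the standard `java.lang.Character.getName(int)` (available since Java 7) as a stand-in for ICU4J's `UCharacter.getName(int)`. The class name `UnicodeNameLookup` and the helper `nameOf` are made up for this example.

```java
public class UnicodeNameLookup {

    // Hypothetical helper: returns the Unicode name of a code point,
    // or a placeholder when the code point is unassigned.
    public static String nameOf(int codePoint) {
        // Character.getName returns null for unassigned code points.
        String name = Character.getName(codePoint);
        return name != null ? name : "<unassigned>";
    }

    public static void main(String[] args) {
        // "déjà", written with escapes so the source file encoding does not matter.
        String text = "d\u00E9j\u00E0";
        // Print each code point together with its Unicode name.
        text.codePoints().forEach(cp ->
                System.out.printf("U+%04X %s%n", cp, nameOf(cp)));
    }
}
```

With ICU4J on the classpath, the call to `Character.getName` could presumably be swapped for `UCharacter.getName`, but that is not tested here.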

However, I don't think the names are consistent enough to allow a
generic use of regular expressions. What Daniel is trying to achieve
looks interesting anyway.
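To make the consistency concern concrete, here is a small sketch. It assumes, purely for illustration, that the goal is to guess a base letter from a character's name. The pattern and the names `NameRegexSketch` and `baseLetter` are hypothetical and are not Daniel's actual approach. It uses the JDK's `Character.getName(int)` rather than ICU4J.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NameRegexSketch {

    // Hypothetical pattern for names such as "LATIN SMALL LETTER E WITH ACUTE".
    private static final Pattern ACCENTED =
            Pattern.compile("LATIN (SMALL|CAPITAL) LETTER ([A-Z]) WITH .+");

    // Returns the base letter guessed from the Unicode name,
    // or null when the name does not fit the pattern.
    public static String baseLetter(int codePoint) {
        String name = Character.getName(codePoint);
        if (name == null) {
            return null; // unassigned code point
        }
        Matcher m = ACCENTED.matcher(name);
        if (!m.matches()) {
            return null;
        }
        String letter = m.group(2);
        return "SMALL".equals(m.group(1)) ? letter.toLowerCase(Locale.ROOT) : letter;
    }

    public static void main(String[] args) {
        // U+00E9 fits the pattern; U+00DF (LATIN SMALL LETTER SHARP S)
        // and U+0153 (LATIN SMALL LIGATURE OE) do not.
        int[] samples = {0x00E9, 0x00C9, 0x00DF, 0x0153};
        for (int cp : samples) {
            System.out.printf("U+%04X %s -> %s%n",
                    cp, Character.getName(cp), baseLetter(cp));
        }
    }
}
```

The pattern handles the "LETTER X WITH ..." family, but names like LATIN SMALL LETTER SHARP S or LATIN SMALL LIGATURE OE fall outside it, so each such family would need its own rule.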

Good luck,

-- 
Pierrick Brihaye, informaticien
Service régional de l'Inventaire
DRAC Bretagne
mailto:pierrick.brihaye@culture.gouv.fr
+33 (0)2 99 29 67 78


