directory-dev mailing list archives

From Ole Ersoy <ole_er...@yahoo.com>
Subject Re: String preparations
Date Tue, 26 Dec 2006 16:44:01 GMT
I would wait for Java six.  See you at the unicoke
conference :-)

Happy Holidays!

- Ole


--- Emmanuel Lecharny <elecharny@gmail.com> wrote:

> Hi guys !
> 
> I'm currently working on the implementation of RFC 4518, which says that
> to be able to apply MatchingRules to String values, we should first
> transform ('prepare') them.
> 
> This transformation is a six-step process, pretty boring, and somewhere in
> the middle there is a Normalization step, where characters may be
> transformed into multi-character sequences: "Schön" will be transformed to
> "Scho\u0308n" (the ö is transformed to a simple 'o' plus a combining
> code). Note that this is *not* a good example, because the transformation
> we must implement is different: it's the NFKC transformation. (For those
> who have _nothing_ else to do, or who had an argument with a
> boyfriend/girlfriend and have a lot of time to waste while he/she cools
> down, here is the doc:
> http://www.unicode.org/unicode/reports/tr15/tr15-22.html#Specification)
> 
> OK, now, the point is: in Java 5, there is nothing in the API to do
> this normalization (Java 6 has it!), but as we won't switch to Java 6, it
> leaves us with a few options:
> 1) Why the hell do we need to take care of those bloody countries with
> bloody letters (hieroglyphs, or whatever I can't read) that exceed the
> Beauty of US-ASCII ???
> 2) Damn, I'm French/German/Turkish/... (ISO 3166, pick your country) and
> my name does not make it with US-ASCII (like Szörner, or Lécharny :). I
> have to do some normalization...
> 2-a) Let's wait for Java 6... We are not in a hurry, the current code
> covers 99.9999999% of all the cases.
> 2-b) Let's use the apache-abdera Unicode impl, it seems pretty complete.
> 2-c) I feel like implementing this Normalizer myself, because I LOVE
> Unicode! (I know all of the 1 156 345 characters, and I can draw them
> knowing only their values... Actually, I also do crack, and I am a
> speaker at each Unicoke conference...)
> 
> OK, OK, I think that 2-b does the trick, from my point of view. WDYT?
> 
> Emmanuel L\u00e9charny
> 
> Oh, and a great idea if you forgot to send a gift to your mother-in-law:
> the latest version of the Unicode spec, only 1450 pages!
> http://search.barnesandnoble.com/booksearch/isbninquiry.asp?ean=9780321480910&displayonly=TOC&z=y#TOC
> 
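
For reference, the difference between the "Schön" example in the quoted message and the NFKC form it says is actually required can be sketched with java.text.Normalizer, the API that Java 6 adds. This is only an illustration under that assumption: it is not the code used in the project, it does not run on Java 5, and the class name NormalizeSketch and its helper methods are made up for the example.

```java
import java.text.Normalizer;

public class NormalizeSketch {

    // Canonical decomposition (NFD): a precomposed character such as
    // U+00F6 (ö) is split into a base letter plus a combining mark.
    static String nfd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    // Compatibility decomposition followed by canonical composition (NFKC),
    // the form named in the quoted message.
    static String nfkc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        String word = "Sch\u00f6n"; // "Schön" with a precomposed ö

        // NFD yields "Scho\u0308n": six chars instead of five.
        System.out.println(nfd(word).length());

        // NFKC recomposes the decomposed form back to the precomposed one.
        System.out.println(nfkc("Scho\u0308n").equals(word));

        // NFKC also folds compatibility characters, e.g. the ligature U+FB01.
        System.out.println(nfkc("\uFB01"));
    }
}
```

With NFKC, "Schön" stays a five-character string, while compatibility characters such as the "fi" ligature are replaced by their plain equivalents, which is why the decomposed example above is not representative of the required step.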

