From: "Luke Kenneth Casson Leighton" <lkcl@samba-tng.org>
Sent: Tuesday, June 12, 2001 10:22 AM
> for various reasons i am prompted to ask,
>
> how would the idea of having an apr_ucs16 set of routines,
> apr_wstrcat, apr_wstrcpy, apr_wtolower, apr_wtoupper etc.,
> be received?
Well, since apr_isfoo/apr_tofoo were already 'reinvented', I don't see a
huge problem.
> on nt, it's easy: straightforward usage of the NT
> wstrcat, wstrcpy etc. lines.
These are the folks who never read the "Security Implications" of UTF-8,
leaving 40% of all IIS webservers still vulnerable, so I'm dubious :-)
> on unix, it's slightly more tricky, but easily doable.
> [and example code exists in samba, anyway:
> they've tried it there, but never yet completed it
> satisfactorily]
>
> iirc, glib has a unicode library, however it is ucs32 not
> ucs16, and depends on glib, which is an N-mbytes install,
> and not what i need, iow.
>
> how about it? :)
Well, how about a simple question: why restrict ourselves to ucs2?
(There's no such thing as ucs16/32; it's ucs2/4.)
Can iconv/apr_iconv provide this in a charset-opaque manner? That is, if
I want three 'characters' in shift-jis, can it give me the right number
of bytes? The reason is simple: Unicode is already splintered into a
multi-word character set anyway. I suspect it's easier to just get it
right: know which apr_xlate has been opened, ask for the char len vs.
the byte len (sizeof), and provide the strcpy/cmp, etc. on top of that.
Bill