apr-dev mailing list archives

From "William A. Rowe, Jr." <ad...@rowe-clan.net>
Subject Re: apr unicode-16 lib.
Date Tue, 12 Jun 2001 16:46:30 GMT
From: "Luke Kenneth Casson Leighton" <lkcl@samba-tng.org>
Sent: Tuesday, June 12, 2001 10:22 AM


> for various reasons i am prompted to ask,
> 
> how would the idea of having an apr_ucs16 set of routines,
> apr_wstrcat, apr_wstrcpy, apr_wtolower, apr_wtoupper etc.,
> be received?

Well, since apr_isfoo apr_tofoo was 'reinvented', I don't see a
huge problem.

> on nt, it's easy: straightforward usage of the NT 
> wstrcat, wstrcpy etc. lines.

These are the folks who never read the "Security Implications" of UTF-8,
leaving 40% of all IIS webservers still vulnerable, so I'm dubious :-)

> on unix, it's slightly more tricky, but easily doable.
> [and example code exists in samba, anyway:
> they've tried it there, but never yet completed it
> satisfactorily]
>
> iirc, glib has a unicode library, however it is ucs32 not
> ucs16, and depends on glib, which is an N-mbytes install,
> and not what i need, iow.
> 
> how about it? :)

Well, how about a simple question: why restrict ourselves to ucs2?
(There is no such thing as ucs16/32; the names are ucs2/4.)

Can iconv/apr_iconv provide this in a charset-opaque manner?  That is, if
I want three 'characters' in shift-jis, can it give me the right number
of bytes?  The reason is simple: Unicode is already splintered into a
multi-word character set anyway.  I suspect it's easier to just get it
right: knowing which apr_xlate has been opened, ask it for the char len
vs. the byte len (sizeof), and provide the strcpy/cmp, etc. on top of
that.
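
To make the char len vs. byte len distinction concrete, here is a minimal
sketch in C for the shift-jis case. The function names are hypothetical;
none of this is an existing APR or iconv call. It assumes the common
convention that bytes 0x81-0x9F and 0xE0-0xFC introduce a two-byte
character and every other byte stands alone. A charset-opaque version
would have to get that rule from the opened translation handle rather
than hard-coding it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an APR or iconv function).
 * Assumption: 0x81-0x9F and 0xE0-0xFC are shift-jis lead bytes. */
int sjis_is_lead_byte(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Number of bytes occupied by the first nchars characters of s.
 * Stops early at the end of the string or at a truncated
 * two-byte character. */
size_t sjis_bytes_for_chars(const char *s, size_t nchars)
{
    size_t bytes = 0;

    assert(s != NULL);
    while (nchars > 0 && s[bytes] != '\0') {
        if (sjis_is_lead_byte((unsigned char)s[bytes])) {
            if (s[bytes + 1] == '\0')
                break;          /* lead byte with no trail byte */
            bytes += 2;
        } else {
            bytes += 1;         /* ASCII or half-width katakana */
        }
        nchars--;
    }
    return bytes;
}

/* Number of complete characters in s (the "char len"). */
size_t sjis_char_count(const char *s)
{
    size_t chars = 0, pos = 0, step;

    assert(s != NULL);
    while ((step = sjis_bytes_for_chars(s + pos, 1)) > 0) {
        pos += step;
        chars++;
    }
    return chars;
}
```

For the six-byte string "A\x93\xFA\x96\x7B" "B" (an ASCII letter, two
double-byte characters, another ASCII letter), sjis_char_count() yields 4
and sjis_bytes_for_chars(s, 3) yields 5. Note that the trail byte 0x7B is
also ASCII '{', which is why a byte-wise strcpy/cmp cannot simply be
reused.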

Bill