httpd-dev mailing list archives

From "William A. Rowe, Jr." <wr...@rowe-clan.net>
Subject RE: [addt'n] Unicode URL encoding
Date Thu, 05 Oct 2000 18:55:53 GMT

Sorry I never trimmed the first reply :-(

My only concern with the most fundamental translation to this new
scheme I propose is that we have the additional setup for what will
be very, very frequent invocations.  That's why I tuned the thing
(how well I haven't decided).  Every call to apr_open, every file 
stat, etc on win32 will need this function.

Agreed that we can change args, but I don't want to be comparing
strings to get at these two fundamental character sets.  If it needs
to be more sophisticated (old browsers like Netscape 4.x that just
pass URLs in their current charset) then we need the whole existing
apr_xlate schema.  I expect to be spending some time in your code :-)
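
For readers following along, here is a minimal sketch of what a dedicated
UTF-8 to UCS-2 routine with no charset-name lookup might look like.  The
name utf8_to_ucs2, its signature, and the use of uint16_t in place of
apr_wchar_t are assumptions made for illustration; this is not the code
from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, NOT the apr_ucs2_from_utf8 from the patch.
 * Decodes NUL-terminated UTF-8 into 16-bit UCS-2 units and terminates
 * the output with a 0 unit.  Returns the number of units written
 * (excluding the terminator), or -1 on malformed input, on a code point
 * above U+FFFF, or when the output buffer is too small.
 * Simplification: overlong forms and surrogate values are not rejected. */
static long utf8_to_ucs2(uint16_t *out, size_t outlen, const char *in)
{
    const unsigned char *p = (const unsigned char *)in;
    size_t n = 0;

    while (*p) {
        uint32_t ch;
        int extra;                      /* continuation bytes expected */

        if (*p < 0x80)                { ch = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { ch = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { ch = *p & 0x0F; extra = 2; }
        else return -1;   /* stray continuation or 4-byte sequence */
        p++;

        while (extra-- > 0) {
            if ((*p & 0xC0) != 0x80)
                return -1;              /* also catches a premature NUL */
            ch = (ch << 6) | (*p & 0x3F);
            p++;
        }

        if (n + 1 >= outlen)
            return -1;                  /* keep room for the terminator */
        out[n++] = (uint16_t)ch;
    }

    if (outlen == 0)
        return -1;
    out[n] = 0;
    return (long)n;
}
```

The per-call setup here is a pointer cast and a counter, which is the
property that matters if every file open and stat on win32 goes through it.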

Bill

p.s. for anyone watching/playing with this... I've a slightly more
comprehensive testuct.c source attached... the first test had two
big holes I just filled in.

> -----Original Message-----
> From: William A. Rowe, Jr. [mailto:wrowe@rowe-clan.net]
> Sent: Thursday, October 05, 2000 1:33 PM
> To: new-httpd@apache.org
> Subject: RE: [addt'n] Unicode URL encoding
> 
> 
> So...
> 
> are you suggesting I commit the utf8<->ucs2 translation, but hook it 
> into the existing apr_xlate lingo?  Just checking :-)
> 
> Bill
> 
> > From: Jeff Trawick [mailto:trawickj@bellsouth.net]
> > Sent: Thursday, October 05, 2000 1:12 PM
> >
> > > +APR_EXPORT(const char*) apr_ucs2_from_utf8(apr_wchar_t *out, const char *in);
> > > +
> > > +APR_EXPORT(const apr_wchar_t*) apr_utf8_from_ucs2(char *in, const apr_wchar_t *out);
> > 
> > I would suggest a different API for this -- the one we already have.
> > 
> > For my own testing purposes, I integrated some custom translation
> > logic into apr_xlate, as shown below...  The mechanics of the
> > translation I added are not complete (just good enough to test some
> > interesting cases on my laptop).  Also, there are better ways to
> > integrate it (like storing a function pointer instead of using the
> > goofy builtin_to16 and builtin_from16 flags).
> > 
> > The reason I hacked this code in probably applies to your translation:
> > not all iconv() implementations are created equal, and I often use one
> > (not-new-enough glibc) that didn't do the translation I wanted.  I
> > didn't want to change mod_charset_lite to use more than one API, so I
> > changed APR.
> > 
> > The way this hardcoded support was added^H^H^H^H^Hhacked in, iconv
> > support is not required on the platform (subject to a buglet or two).
> > 
> > Index: lib/apr/i18n/unix/xlate.c
> > ===================================================================
> > +    if (!strcmp(frompage, "ISO-8859-1") &&
> > +        !strcmp(topage,   "UTS-16"))
> > +    else if (!strcmp(topage,     "ISO-8859-1") &&
> > +             !strcmp(frompage,   "UTS-16"))
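
The function-pointer alternative mentioned in the quoted message can be
sketched roughly as follows: the charset names are compared once, when the
handle is opened, and every later conversion is a plain indirect call.  All
names here (xlate_sketch, xlate_sketch_open, latin1_to_16, latin1_from_16)
are hypothetical, the charset labels are placeholders, and 16-bit units are
assumed to be in native byte order; this is not apr_xlate's actual
implementation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical converter signature: converts nchars characters from in
 * to out and returns the number of characters converted. */
typedef size_t (*xlate_fn)(const void *in, size_t nchars, void *out);

/* Hypothetical handle that stores the converter instead of flags. */
typedef struct xlate_sketch {
    xlate_fn convert;
} xlate_sketch;

/* Widen each ISO-8859-1 byte to one 16-bit unit. */
static size_t latin1_to_16(const void *in, size_t nchars, void *out)
{
    const unsigned char *src = in;
    unsigned short *dst = out;
    size_t i;

    for (i = 0; i < nchars; i++)
        dst[i] = src[i];
    return nchars;
}

/* Narrow 16-bit units to ISO-8859-1, substituting '?' above 0xFF. */
static size_t latin1_from_16(const void *in, size_t nchars, void *out)
{
    const unsigned short *src = in;
    unsigned char *dst = out;
    size_t i;

    for (i = 0; i < nchars; i++)
        dst[i] = src[i] <= 0xFF ? (unsigned char)src[i] : '?';
    return nchars;
}

/* The string comparisons happen only here, once per handle.
 * Returns 0 on success, -1 for an unsupported pair. */
static int xlate_sketch_open(xlate_sketch *x,
                             const char *topage, const char *frompage)
{
    if (!strcmp(frompage, "ISO-8859-1") && !strcmp(topage, "UCS-2"))
        x->convert = latin1_to_16;
    else if (!strcmp(topage, "ISO-8859-1") && !strcmp(frompage, "UCS-2"))
        x->convert = latin1_from_16;
    else
        return -1;
    return 0;
}
```

A caller that opens the handle once and keeps it around pays the strcmp
cost a single time; each subsequent x.convert(...) call does no string
comparison at all.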
