httpd-apreq-dev mailing list archives

From Joe Schaefer <>
Subject Re: today's code review..
Date Thu, 10 Mar 2005 22:40:47 GMT
Joe Schaefer <> writes:

> Max Kellermann <> writes:


>> - we should define what apreq_decode() does when it sees 8 bit ASCII
>>   escapes and unicode escapes in one string; the current behaviour
>>   (ignoring that) isn't a good solution. Maybe mark with that utf8
>>   flag? (could also solve the previous item on my list)
> Hmm, tough question.  If we start marking things as utf8, we need to
> do it accurately.  Can you be sure the unicode escapes don't come from
> some other UTF encoding?

IIRC the current url_decode implementation treats %uXXXX as a
numeric 16-bit codepoint and decodes it into a UTF-8 character.
The %uXXXX form comes from the escape() function of ECMA-262 B.2.1,
so we can be confident about the UTF-8-ness of the result whenever
such an encoding is detected (in the query string, obviously).
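
For illustration, the decoding described above could be sketched roughly
as follows. This is not apreq's actual url_decode. The function name
decode_escapes, the returned flag, and the choice to leave surrogate
escapes untouched are all assumptions made for this example, which is
written in Python for brevity.

```python
import re
from typing import Tuple

# Hypothetical sketch, NOT apreq's url_decode.
# Matches either a %uXXXX escape (group 1) or a plain %XX escape (group 2).
_ESCAPE = re.compile(rb"%u([0-9A-Fa-f]{4})|%([0-9A-Fa-f]{2})")


def decode_escapes(raw: bytes) -> Tuple[bytes, bool]:
    """Decode %XX and %uXXXX escapes.

    Returns the decoded bytes and a flag telling whether any %uXXXX
    escape was decoded (an assumed stand-in for a "utf8" marker).
    """
    saw_unicode = False

    def replace(match):
        nonlocal saw_unicode
        if match.group(1) is not None:
            codepoint = int(match.group(1), 16)
            # Assumption: a lone surrogate has no UTF-8 form, so the
            # escape is left as-is instead of being decoded.
            if 0xD800 <= codepoint <= 0xDFFF:
                return match.group(0)
            saw_unicode = True
            # Treat XXXX as a 16-bit codepoint and emit its UTF-8 bytes.
            return chr(codepoint).encode("utf-8")
        # A plain %XX escape yields exactly one raw byte.
        return bytes([int(match.group(2), 16)])

    return _ESCAPE.sub(replace, raw), saw_unicode
```

With this sketch, decode_escapes(b"caf%u00E9") returns
(b"caf\xc3\xa9", True), while decode_escapes(b"%E9%u00E9") returns
(b"\xe9\xc3\xa9", True): the flag is set, yet the raw 0xE9 byte from the
8-bit escape means the whole string is not valid UTF-8. That mixed case
is the one Max raised.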

But beyond that, I'm not sure what we can do right now in apreq.
The %uXXXX escapes are rare enough that I don't think we should
adjust our decode-APIs around them.

Joe Schaefer
