httpd-dev mailing list archives

From Graham Leggett <minf...@sharp.fm>
Subject Re: mod_include: echo, entity encoding and UTF-8
Date Sat, 18 Sep 2010 18:00:26 GMT
On 18 Sep 2010, at 7:10 PM, Graham Leggett wrote:

> When the SSI tag below is handled, the value of the string output to  
> the browser is entity encoded:
>
> <!--#echo encoding="entity" var="MY_VAR"-->
>
> This is done with a line that looks something like this:
>
> /* PR#25202: escape anything non-ascii here */
> echo_text = ap_escape_html2(ctx->dpool, val, 1);
>
> The problem with the above is the parameter "1", which means that  
> non-ASCII characters are entity encoded as html escape sequences,  
> and in the process anything encoded with UTF-8 (and is not ASCII)  
> breaks.

Looking further, the change made for PR25202 caused the regression
described in PR47686, where UTF-8 support broke.
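
To illustrate the breakage, here is a minimal sketch, assuming the escaping is applied one byte at a time to every byte outside the ASCII range. escape_non_ascii_bytes() is a hypothetical stand-in written for this example, not the httpd function itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the "escape anything non-ascii" behaviour:
 * each byte >= 0x80 becomes a numeric character reference, so a
 * multi-byte UTF-8 sequence is split into one reference per byte. */
static void escape_non_ascii_bytes(const char *in, char *out, size_t outlen)
{
    size_t j = 0;

    assert(in != NULL && out != NULL && outlen > 0);

    /* Keep room for the longest reference ("&#255;") plus the NUL. */
    for (size_t i = 0; in[i] != '\0' && j + 7 <= outlen; i++) {
        unsigned char c = (unsigned char)in[i];
        if (c < 0x80) {
            out[j++] = (char)c;
        } else {
            j += (size_t)snprintf(out + j, outlen - j, "&#%u;", c);
        }
    }
    out[j] = '\0';
}
```

Given the UTF-8 bytes "caf\xc3\xa9" (the word "café"), this produces "caf&#195;&#169;". A browser reads that as two separate characters rather than a single "é".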

I've created a fix for this, in which the "set" and "echo" SSI commands
have been taught to handle "encoding" and "decoding" parameters.

For both echo and set, the value is first decoded as specified by the
"decoding" parameter, and then encoded as specified by the "encoding"
parameter. This allows full control over the encoding and decoding of
variables and echoed values, depending on where they came from.

Both "encoding" and "decoding" can take multiple comma-separated values,
so you can, for example, strip off URL encoding and then entity encoding
before using a value, like this: decoding="url,entity".
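
As a rough sketch of how such a list could be applied in order (this is not the code from the patch), the following assumes the names "url" and "entity" and uses hypothetical helpers url_decode(), entity_decode() and apply_decoders(). The entity decoder only knows four named entities, to keep the example short.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: strip percent-encoding in place. */
static void url_decode(char *s)
{
    char *w = s;
    for (const char *r = s; *r != '\0'; ) {
        if (r[0] == '%' && isxdigit((unsigned char)r[1])
                        && isxdigit((unsigned char)r[2])) {
            char hex[3] = { r[1], r[2], '\0' };
            *w++ = (char)strtol(hex, NULL, 16);
            r += 3;
        } else {
            *w++ = *r++;
        }
    }
    *w = '\0';
}

/* Hypothetical helper: replace a few named entities in place. */
static void entity_decode(char *s)
{
    static const struct { const char *name; char ch; } map[] = {
        { "&lt;", '<' }, { "&gt;", '>' }, { "&quot;", '"' }, { "&amp;", '&' }
    };
    const size_t n = sizeof map / sizeof map[0];
    char *w = s;
    for (const char *r = s; *r != '\0'; ) {
        size_t k;
        for (k = 0; k < n; k++) {
            size_t len = strlen(map[k].name);
            if (strncmp(r, map[k].name, len) == 0) {
                *w++ = map[k].ch;
                r += len;
                break;
            }
        }
        if (k == n) {
            *w++ = *r++;   /* no entity matched: copy the byte as-is */
        }
    }
    *w = '\0';
}

/* Apply each decoder named in a comma-separated list, left to right.
 * Returns 0 on success, -1 on an unknown or empty name. */
static int apply_decoders(const char *list, char *value)
{
    assert(list != NULL && value != NULL);
    while (*list != '\0') {
        size_t len = strcspn(list, ",");
        if (len == 3 && strncmp(list, "url", 3) == 0) {
            url_decode(value);
        } else if (len == 6 && strncmp(list, "entity", 6) == 0) {
            entity_decode(value);
        } else {
            return -1;
        }
        list += len;
        if (*list == ',') {
            list++;
        }
    }
    return 0;
}
```

With the list "url,entity", the value "%26lt%3Bb%26gt%3B" first becomes "&lt;b&gt;" and then "<b>". The order matters: with "entity,url" the entity pass would find nothing to decode, and the result would stay at "&lt;b&gt;".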

Regards,
Graham