cocoon-users mailing list archives

From Christopher Schultz <ch...@christopherschultz.net>
Subject Re: [2.1] Overzealous escaping of high Unicode code points
Date Tue, 20 Jun 2017 20:14:36 GMT
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Greg,

On 6/20/17 4:11 PM, Christopher Schultz wrote:
> Greg,
> 
> On 6/8/17 2:17 PM, gelo1234 wrote:
>> Chris,
> 
>> Even with C3 (Cocoon 3.0 beta), unless you specify the optional
>> encoding in your Serializer config, you fall back to the default,
>> UTF-8:
> 
>> org.apache.cocoon.optional.servlet.components.sax.serializers.util
>
>> public class ConfigurationUtils {
>>
>>     private ConfigurationUtils() { }
>>
>>     public static String getEncoding(
>>             Map<String, ? extends Object> configuration) {
>>         String encoding = (String) configuration.get("encoding");
>>
>>         if (encoding == null || "".equals(encoding)) {
>>             encoding = "UTF-8";
>>         }
>>
>>         return encoding;
>>     }
>>     ...
> 
> I would have expected the Unicode code point to be converted into a
> single 4-byte UTF-8 sequence without any &-encoding at all. It looks
> like what I got was a pair of 2-byte characters, each with &-encoding.
> 
> I'll try UTF-16 but my expectation is that it's going to get
> worse, not better.
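
For reference, the expectation quoted above can be checked with a few
lines of plain Java. This is only a minimal sketch: the class name
Utf8LengthSketch and the choice of U+1F600 as the sample emoji are
assumptions made for illustration, and nothing here is Cocoon code.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Cocoon.
final class Utf8LengthSketch {

    // U+1F600 (grinning face), written as its UTF-16 surrogate pair.
    static final String EMOJI = "\uD83D\uDE00";

    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Java strings are UTF-16, so the emoji is two chars in memory.
        System.out.println(EMOJI.length());                          // 2
        // It is still a single Unicode code point.
        System.out.println(EMOJI.codePointCount(0, EMOJI.length())); // 1
        // Encoded as UTF-8, it is one 4-byte sequence.
        System.out.println(utf8Length(EMOJI));                       // 4
    }
}
```

The two chars exist only in the in-memory UTF-16 form; a serializer
that treats each one as its own character ends up handling two halves
instead of one code point.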

Interestingly enough, my emojis are now showing (I don't fully
understand why!), but it looks like my CSS isn't being loaded. That's
a separate problem I'll have to figure out for myself.

In my own application, switching from commons-lang to commons-lang3
for HTML/XML escaping allowed me to use these 4-byte emojis and UTF-8
together. I'm surprised that Cocoon can't do the same thing. (I think
it comes down to exactly how the character escaper makes its decisions.)
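
As a rough illustration of the kind of decision meant here, the sketch
below contrasts an escaper that looks at one UTF-16 char at a time
with one that looks at whole code points. It is not the actual
commons-lang, commons-lang3, or Cocoon code; the class and method
names are invented for this example, and it only handles the
non-ASCII decision (it does not escape &, <, or >).

```java
// Hypothetical escapers for illustration only; not library code.
final class EscapeSketch {

    // Decides per UTF-16 char: each half of a surrogate pair is
    // escaped separately, giving two numeric references.
    static String escapePerChar(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 0x7F) {
                out.append("&#").append((int) c).append(';');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Decides per code point: a surrogate pair is read as one value
    // and produces a single numeric reference.
    static String escapePerCodePoint(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp > 0x7F) {
                out.append("&#").append(cp).append(';');
            } else {
                out.append((char) cp);
            }
            i += Character.charCount(cp); // 2 for supplementary code points
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String emoji = "\uD83D\uDE00"; // U+1F600
        System.out.println(escapePerChar(emoji));      // &#55357;&#56832;
        System.out.println(escapePerCodePoint(emoji)); // &#128512;
    }
}
```

The per-char output is the "pair of &-encoded characters" symptom; the
per-code-point output is a valid reference to the emoji. A third
option, when the output encoding is UTF-8, is to not escape such
characters at all and let the encoder write the 4-byte sequence.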

Thanks,
- -chris
-----BEGIN PGP SIGNATURE-----
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIzBAEBCAAdFiEEMmKgYcQvxMe7tcJcHPApP6U8pFgFAllJgiwACgkQHPApP6U8
pFgJkRAAqiXn7DWNDN41m1V98aI5xWjTuoka0tKcadN1IUGemTZwipaXHtYQcois
6yuI3st31ZuanghIpRPcBu9pZzuHtOSBVSHZSIhDGqPwYgczScQ2LgnfMi6zwAdd
j2LFlSWtKGjgCczV5Ok56PyMq1BEAOVw96vmF5xfXmpLAyNA/PvLKsncoW4pN+ES
1MQMm1aPwbmEpWz7ykReUzfauwBtL4rEX1wO3pl88m9Wq3x174AKHWs/a+4Z1Hdq
0CnxfrdTK50p7Ng+ECfnPwx8y1Em64lA7KKMuz2jTd0PnxlpZTAgO6lq8S7BdSeY
H1lwBJojVT/+m2w8b9OC/XoyiAyiC/zIswQ3TSMA3ZC2SnCxxAXMTsmT49Ql+lyq
01JRCIVMitKeoKI4I4066oaBW91FpSSpZXX14XCHrMBtKnIJI+NxBnI++eQq8wdi
ZdX3GzLF2zaPHvZMSz4DRskR1xKGLsAxZAukINW3AGrEAZ/GwbPd76ml3YJam5Yy
R31u0kcRJl4z79pd1n46yxB66V10Rn5IkSMQ8R7uK/ht9wLi5T8bkeAoLjZFFoyq
awmfQTbJzquXAtwjX99WKWEzviN2ph+P0h2rBInHnos5ud8IlLjcS7FmdxQ4DNOw
Nirmj7cikxcr2Fn22pGQh6o3/Eph0lMf1d1HjUZ1C7SchEgsqrk=
=0nTd
-----END PGP SIGNATURE-----

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@cocoon.apache.org
For additional commands, e-mail: users-help@cocoon.apache.org

