httpd-dev mailing list archives

From "John Dougrez-Lewis" <jle...@lightblue.com>
Subject RE: Apache Module Development Query on character encodings.
Date Wed, 21 Oct 2015 06:04:27 GMT
Hi Nick,

> Hi, are you by any chance the Raving Loony I once knew at Cambridge?

Yes indeed - that must be 35 years ago now - these days I'm a bit more
sensible (although the legacy of the OMRLP lives on).


> Basically there are three parts to working with character encodings:
>  * Detecting them in incoming data.
>  * Converting them to order.
>  * Correctly labelling outgoing data.

> mod_xml2enc will do all that for libxml2-based filters, and could easily
> be tweaked to drop the libxml2-specific optimisations for general-purpose
> use.  Alternatively the charset-detection from mod_xml2enc could probably
> be folded into mod_charset_lite.

So basically mod_xml2enc will detect the incoming encoding (whatever it may
be)?


Are there not HTTP headers which give a good indication of the input format
(albeit that you have to detect the format and read the stream to confirm
it)?
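
For concreteness, the header in question is presumably Content-Type with its
charset parameter. The following is a minimal sketch in plain C, assuming
nothing about the Apache API: the names charset_from_content_type and
has_prefix_nocase are hypothetical, and quoted parameter values containing
';' are not handled. It only extracts the declared label, which would still
need confirming against the stream itself.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: case-insensitive test that s begins with prefix. */
static int has_prefix_nocase(const char *s, const char *prefix)
{
    while (*prefix != '\0') {
        if (tolower((unsigned char)*s) != tolower((unsigned char)*prefix))
            return 0;
        s++;
        prefix++;
    }
    return 1;
}

/*
 * Hypothetical helper, not part of the Apache API: copy the value of the
 * charset parameter of a Content-Type header value into out.
 * Returns 1 if a non-empty charset label was found, 0 otherwise.
 * The label is only a hint; the body may still need sniffing to confirm it.
 */
static int charset_from_content_type(const char *ctype, char *out, size_t outlen)
{
    const char *p;
    size_t n = 0;

    assert(out != NULL && outlen > 0);
    out[0] = '\0';
    if (ctype == NULL)
        return 0;

    /* Parameters follow the media type, each introduced by ';'. */
    for (p = strchr(ctype, ';'); p != NULL; p = strchr(p, ';')) {
        p++;                                  /* step past the ';' */
        while (*p == ' ' || *p == '\t')       /* skip optional whitespace */
            p++;
        if (!has_prefix_nocase(p, "charset="))
            continue;                         /* some other parameter */
        p += strlen("charset=");
        if (*p == '"')                        /* the value may be quoted */
            p++;
        /* Copy up to the end of the value, truncating to fit out. */
        while (*p != '\0' && *p != '"' && *p != ';' &&
               !isspace((unsigned char)*p) && n + 1 < outlen)
            out[n++] = *p++;
        out[n] = '\0';
        return n > 0;
    }
    return 0;
}
```

Given "text/html; charset=ISO-8859-1" this yields the label ISO-8859-1; given
a bare "text/plain" it reports that no charset was declared, which is the case
where detection from the data itself becomes necessary.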


I'm new to Apache coding/configuration - how would the xml2enc/mod_charset_lite
input & output filters be set up in configuration and/or chained in code?


Do you have any views on libxml2 suitability for use within Apache module
code?

It appears to have good all-round performance compared to other XML
libraries. I note that it has a C++ wrapper which is LGPL'ed, so there are
likely to be licensing/distribution issues if I ever decided to try to
release code under an Apache License.



Regards,

John

