camel-issues mailing list archives

From "Robert Half (JIRA)" <>
Subject [jira] [Commented] (CAMEL-11846) xtokenize and apply xslt to a string does not work with UTF-16BE
Date Thu, 09 Nov 2017 12:38:00 GMT


Robert Half commented on CAMEL-11846:

!my  example looks like this (and  it's really UTF-16BE).png!

> xtokenize and apply xslt to a string does not work with UTF-16BE
> ----------------------------------------------------------------
>                 Key: CAMEL-11846
>                 URL:
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-core
>    Affects Versions: 2.17.5
>            Reporter: Robert Half
>         Attachments: UTF-16BE (with BOM).png, my  example looks like this (and  it's really UTF-16BE).png
>
> In XML, the encoding is often declared inside the <?xml ..?> declaration. In general, you cannot read the declaration if you don't know the encoding, but XML parsers can detect several encodings, which allows them to read it. With that information they can read the whole file without knowing the charset in the first place.
> xtokenize and xslt use XMLInputFactory#createXMLStreamReader(Reader). But by providing a Reader, Camel tells the XML parser that the encoding is already known, so the parser does not detect it.
> Also, Camel sets the charset to UTF-8 if it is not provided in a header. This makes the underlying reader fail when reading UTF-16.
> Using XMLInputFactory#createXMLStreamReader(InputStream) inside XMLTokenExpressionIterator works (tried in a patch). But the next xslt step fails again because it also uses a Reader.
> See the Stack Overflow question for reference:
> []
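
The Reader versus InputStream difference described above can be reproduced outside Camel with plain StAX. This is only a minimal sketch, not Camel's code: the class name, the helper methods and the sample document are made up for illustration, and the sample is assumed to be big-endian UTF-16 with a BOM and an encoding="UTF-16" declaration. The UTF-8 Reader stands in for the default charset mentioned in the description.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class EncodingDetectionSketch {

    // Hypothetical sample: big-endian UTF-16 bytes preceded by the BOM FE FF.
    static byte[] utf16Document() {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-16\"?><root><item>a</item></root>";
        byte[] body = xml.getBytes(StandardCharsets.UTF_16BE); // UTF_16BE writes no BOM itself
        byte[] doc = new byte[body.length + 2];
        doc[0] = (byte) 0xFE;
        doc[1] = (byte) 0xFF;
        System.arraycopy(body, 0, doc, 2, body.length);
        return doc;
    }

    // Returns the text of the first <item> element, or null if there is none.
    static String firstItemText(XMLStreamReader xml) throws XMLStreamException {
        while (xml.hasNext()) {
            if (xml.next() == XMLStreamConstants.START_ELEMENT
                    && "item".equals(xml.getLocalName())) {
                return xml.getElementText();
            }
        }
        return null;
    }

    // InputStream variant: the parser sees the raw bytes and detects the encoding itself.
    static String firstItemViaStream(byte[] doc) {
        try {
            XMLStreamReader xml = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new ByteArrayInputStream(doc));
            return firstItemText(xml);
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reader variant: the bytes are decoded as UTF-8 before the parser sees them,
    // so the parser never gets a chance to detect UTF-16.
    static boolean utf8ReaderFails(byte[] doc) {
        Reader reader = new InputStreamReader(
                new ByteArrayInputStream(doc), StandardCharsets.UTF_8);
        try {
            XMLStreamReader xml = XMLInputFactory.newInstance().createXMLStreamReader(reader);
            return !"a".equals(firstItemText(xml));
        } catch (XMLStreamException e) {
            return true; // the wrongly decoded characters are not well-formed XML
        }
    }

    public static void main(String[] args) {
        byte[] doc = utf16Document();
        System.out.println("InputStream result: " + firstItemViaStream(doc));
        System.out.println("UTF-8 Reader fails: " + utf8ReaderFails(doc));
    }
}
```

The same reasoning would presumably apply to the xslt step: as long as a Reader is handed to the parser, the charset has to be right before parsing starts.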

