camel-issues mailing list archives

From "Viral Gohel (JIRA)" <>
Subject [jira] [Commented] (CAMEL-11846) xtokenize and apply xslt to a string does not work with UTF-16BE
Date Tue, 03 Oct 2017 11:11:02 GMT


Viral Gohel commented on CAMEL-11846:

Hi Robert,

Do you have an example or a reproducer test case that you use to test this? Since
the fix would involve more than a single file, it would be helpful if you could attach
a reproducer showing the results you are seeing.

> xtokenize and apply xslt to a string does not work with UTF-16BE
> -----------------------------------------------------------------
>                 Key: CAMEL-11846
>                 URL:
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-core
>    Affects Versions: 2.17.5
>            Reporter: Robert Half
> In XML, the encoding is often declared inside the <?xml ..?> declaration. In general, you
> cannot read the declaration if you don't know the encoding, but XML parsers can detect several
> encodings, which allows them to read it. With that information they can read the whole
> file without knowing the charset in the first place.
> xtokenize and xslt use XMLInputFactory#createXMLStreamReader(Reader). But by providing
> a Reader, Camel signals that it already knows the encoding, so the XML parser does not detect it.
> Also, Camel sets the charset to UTF-8 if it is not provided in a header. This makes
> the underlying reader fail when reading UTF-16.
> Using XMLInputFactory#createXMLStreamReader(InputStream) inside XMLTokenExpressionIterator
> works (tried in a patch). But the next xslt step fails because it also uses a Reader.
> See the Stack Overflow question for reference:
> []
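
A minimal sketch of what a standalone reproducer for the behaviour described above might look like, using only the JDK StAX API and no Camel classes. The class name, method names, and sample document are hypothetical and not taken from the issue or from any patch. It assumes the default JDK StAX implementation and that the parser reports the mis-decoded input as an XMLStreamException.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class EncodingDetectionSketch {

    // Hypothetical sample document, encoded as UTF-16BE without a byte-order mark.
    static byte[] sampleUtf16Be() {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-16BE\"?>"
                + "<root><item>caf\u00e9</item></root>";
        return xml.getBytes(StandardCharsets.UTF_16BE);
    }

    // Walks the event stream and concatenates all character data.
    static String readText(XMLStreamReader reader) throws XMLStreamException {
        StringBuilder text = new StringBuilder();
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    text.append(reader.getText());
                }
            }
        } finally {
            reader.close();
        }
        return text.toString();
    }

    // InputStream variant: the parser sees raw bytes and detects the encoding itself.
    static String parseFromStream(byte[] bytes) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        return readText(factory.createXMLStreamReader(new ByteArrayInputStream(bytes)));
    }

    // Reader variant: the bytes are decoded as UTF-8 before the parser sees them,
    // mirroring the default charset described in the issue.
    static boolean failsWithUtf8Reader(byte[] bytes) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        Reader utf8 = new InputStreamReader(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8);
        try {
            readText(factory.createXMLStreamReader(utf8));
            return false;
        } catch (XMLStreamException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = sampleUtf16Be();
        System.out.println("InputStream text: " + parseFromStream(bytes));
        System.out.println("UTF-8 Reader fails: " + failsWithUtf8Reader(bytes));
    }
}
```

The sketch only exercises the parser-level difference between the two createXMLStreamReader overloads. A Camel-level reproducer would still need a route that uses xtokenize followed by an xslt step.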

This message was sent by Atlassian JIRA
