cxf-issues mailing list archives

From "Freeman Fang (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (CXF-6351) Character encoding error in XML schema validation
Date Fri, 15 May 2015 09:11:00 GMT

     [ https://issues.apache.org/jira/browse/CXF-6351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Freeman Fang resolved CXF-6351.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 3.1.0
                   2.7.16
                   3.0.5

> Character encoding error in XML schema validation
> -------------------------------------------------
>
>                 Key: CXF-6351
>                 URL: https://issues.apache.org/jira/browse/CXF-6351
>             Project: CXF
>          Issue Type: Bug
>    Affects Versions: 3.0.4, 2.7.15
>         Environment: JVM using platform default encoding other than UTF-8
>            Reporter: Martin Bonato
>            Assignee: Daniel Kulp
>             Fix For: 3.0.5, 2.7.16, 3.1.0
>
>
> I'm using a WSDL with referenced XML schema files containing German umlaut characters.
> The WSDL and schema files reside in the application's classpath. When XML schema validation
> is enabled for the corresponding web service, the schema validation fails if the platform default
> encoding is not UTF-8 (e.g. ISO-8859-1).
> I've created a test case for the issue (tested with CXF 2.7.15): https://github.com/datentechnik/cxf-schema-encoding
> When the test case is executed with {{-Dfile.encoding=ISO-8859-1}}, it fails with:
> {noformat}
> org.apache.cxf.interceptor.Fault: Could not parse the XML stream caused by: javax.xml.stream.XMLStreamException:
> cvc-enumeration-valid: Value 'm�nnlich' is not facet-valid with respect to enumeration '[männlich,
> weiblich, unbekannt]'. It must be a value from the enumeration.
> {noformat}
> The reason is that schemas referenced from WSDL files are read using the platform default encoding:
> {code:title=org.apache.cxf.wsdl.EndpointReferenceUtils.SchemaLSResourceResolver}
>         private LSInputImpl createInput(String newId, byte[] value) {
>             LSInputImpl impl = new LSInputImpl();
>             impl.setSystemId(newId);
>             impl.setBaseURI(newId);
>             impl.setCharacterStream(
>                 new InputStreamReader(
>                     new ByteArrayInputStream(value)));
>             return impl;
>         }
> {code}
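
The effect of that decoding step can be reproduced outside CXF. The following is a minimal sketch, assuming the schema bytes are UTF-8 and the platform default is ISO-8859-1; the class `DefaultCharsetDemo` and its `decode` helper are hypothetical names used only for illustration and are not part of CXF. The charset is passed explicitly so that the result does not depend on the JVM running the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DefaultCharsetDemo {

    // Hypothetical helper: decodes the bytes through an InputStreamReader
    // using the given charset, standing in for the platform default.
    static String decode(byte[] value, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(value), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "männlich" written with a Unicode escape so the source file encoding does not matter.
        String expected = "m\u00e4nnlich";
        byte[] utf8 = expected.getBytes(StandardCharsets.UTF_8);

        // Decoding with the charset the bytes were written in round-trips the text.
        System.out.println(decode(utf8, StandardCharsets.UTF_8).equals(expected));

        // Decoding the same bytes as ISO-8859-1 turns the two-byte umlaut into two
        // separate characters, so the string no longer matches the enumeration value.
        System.out.println(decode(utf8, StandardCharsets.ISO_8859_1).equals(expected));
    }
}
```

In this sketch the first comparison succeeds and the second fails, which is the kind of mismatch an enumeration facet check would then reject.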
> The {{InputStreamReader}} uses the default platform character encoding. I would recommend
> setting the {{InputStream}} in {{LSInputImpl}} instead of the character stream and letting the
> schema parser decide on the character encoding.
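
The recommendation above can be sketched with the JDK's own DOM Load and Save API, without CXF on the classpath. This is an illustration under stated assumptions, not the change made in the pull request: the class `ByteStreamInputSketch`, its methods, and the `file:///schemas/example.xsd` system ID are hypothetical, and the JDK's `LSInput` stands in for CXF's `LSInputImpl`. The input is handed over as a byte stream, so the parser reads the encoding from the XML declaration instead of relying on the platform default.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

class ByteStreamInputSketch {

    // Hypothetical counterpart of createInput(...): supplies the raw bytes
    // as a byte stream and leaves charset detection to the parser.
    static LSInput createInput(DOMImplementationLS ls, String newId, byte[] value) {
        LSInput input = ls.createLSInput();
        input.setSystemId(newId);
        input.setBaseURI(newId);
        input.setByteStream(new ByteArrayInputStream(value));
        return input;
    }

    // Parses the document and returns the text content of its root element.
    static String parseRootText(byte[] xml) throws Exception {
        DOMImplementationLS ls = (DOMImplementationLS)
            DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
        LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        Document doc = parser.parse(createInput(ls, "file:///schemas/example.xsd", xml));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String text = "m\u00e4nnlich"; // "männlich", escaped to be independent of source encoding

        byte[] utf8 = ("<?xml version=\"1.0\" encoding=\"UTF-8\"?><v>" + text + "</v>")
            .getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = ("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><v>" + text + "</v>")
            .getBytes(StandardCharsets.ISO_8859_1);

        // Both documents should decode to the same text, whatever file.encoding is set to.
        System.out.println(parseRootText(utf8).equals(text));
        System.out.println(parseRootText(latin1).equals(text));
    }
}
```

Running the sketch with and without `-Dfile.encoding=ISO-8859-1` is one way to check that the result no longer depends on the platform default.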
> I've created a pull request https://github.com/apache/cxf/pull/65 which solves the problem
> (tested with CXF 2.7.x).
> I've only tested with the 2.7.x branch, but from the code I think master is affected
> as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
