From: "Artyom Burylov (JIRA)"
To: issues@cxf.apache.org
Reply-To: dev@cxf.apache.org
Date: Thu, 15 Jun 2017 06:13:00 +0000 (UTC)
Subject: [jira] [Updated] (CXF-7408) Problem with a response encoding

[ https://issues.apache.org/jira/browse/CXF-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Artyom Burylov updated CXF-7408:
--------------------------------
Description:

Hello, I have a problem with encoding using Apache CXF.
I send a request to an external SOAP service and get a response without a *charset* parameter in the HTTP *Content-Type* header. The service doesn't send it.
Apache CXF decides the response is *ISO-8859-1* encoded. But the content is actually encoded in *UTF-8* and contains Latin and Cyrillic characters.
As a result, I get invalid values.
Here is an example of a response with invalid encoding.
*Http headers:*
HTTP/1.1 200 OK
Server: nginx
Date: Thu, 15 Jun 2017 05:01:50 GMT
Content-Type: text/xml
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
Content-Encoding: gzip

*A TEST SOAP RESPONSE*
_(Invalid values in ns12:PaymentDocumentID, ns13:region, ns13:city, ns13:address_string and so on)_
{code}
2017-06-15T08:05:56.336+03:00
a29d26c2-f2d1-48ea-be11-a47bd175b40a
3
4fcb1240-5188-11e7-a67f-005056b6513d
40ÐÐ164719-01-7051
10
40ÐÐ164719
40ÐÐ164719
40ÐÐ164719-01</ns5:ServiceID>
Ð¯Ñ€Ð¾ÑÐ»Ð°Ð²ÑÐºÐ°Ñ
Ð¯Ñ€Ð¾ÑÐ»Ð°Ð²Ð»ÑŒ
34Ð°
3d0978ee-6d63-468a-9167-dac0bf36a1bc
12
150029, Ð¾Ð±Ð». Ð¯Ñ€Ð¾ÑÐ»Ð°Ð²ÑÐºÐ°Ñ, Ð³. Ð¯Ñ€Ð¾ÑÐ»Ð°Ð²Ð»ÑŒ, Ð´. 34Ð°
3808008510
380801001
Ð¢ÐµÑÑ‚Ð¾Ð²Ð°Ñ Ð¾Ñ€Ð³Ð°Ð½Ð¸Ð·Ð°Ñ†Ð¸Ñ1
3808008510
380801001
ÐŸÐÐž Ð¡Ð‘Ð•Ð Ð‘ÐÐÐš
Ð¢ÐµÑÑ‚Ð¾Ð²Ð°Ñ Ð¾Ñ€Ð³Ð°Ð½Ð¸Ð·Ð°Ñ†Ð¸Ñ1
044525225
40703810000020105994
30101810400000000225
komarov-ev@yandex.ru
1155000.00
40ÐÐ164719-01-7051
2017
5
{code}
1) Why does Apache CXF ignore [...] and why is UTF-8 not the default encoding?
2) How can I process a response as UTF-8 encoded even without charset=utf-8 in the Content-Type header?

I use Apache CXF together with Wildfly 10.1.0.FINAL, but the same problem happens if I use Apache CXF alone.
---
I also looked at the implementation.
Inside *HTTPConduit* I found the following code (*handleResponseInternal* method):
{code}
String charset = HttpHeaderHelper.findCharset((String)inMessage.get(Message.CONTENT_TYPE));
String normalizedEncoding = HttpHeaderHelper.mapCharset(charset);
{code}
If there is no *charset* in the response's *Content-Type* header, then *normalizedEncoding* is *ISO-8859-1*.
If I set the value to *UTF-8* in debug mode, it works fine and I get a valid result with Cyrillic characters.
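The effect reported here (UTF-8 bytes read as ISO-8859-1 when the Content-Type header carries no charset parameter) can be reproduced outside CXF with plain JDK classes. The following is a minimal sketch and not CXF's implementation: `CharsetFallbackDemo`, `findCharsetParam` and `decode` are hypothetical names introduced only for illustration, and the fallback charset is passed in explicitly so that both outcomes can be compared.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

/** Hypothetical demo class; none of these names come from Apache CXF. */
final class CharsetFallbackDemo {

    /** Returns the charset parameter of a Content-Type value, or null if absent. */
    static String findCharsetParam(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                return p.substring("charset=".length()).replace("\"", "").trim();
            }
        }
        return null;
    }

    /** Decodes the body with the declared charset, or with the given fallback. */
    static String decode(byte[] body, String contentType, Charset fallback) {
        String name = findCharsetParam(contentType);
        Charset cs = (name == null) ? fallback : Charset.forName(name);
        return new String(body, cs);
    }

    public static void main(String[] args) {
        // A four-letter Cyrillic word, written with Unicode escapes so the
        // source file encoding does not matter.
        String original = "\u0422\u0435\u0441\u0442";
        byte[] utf8Body = original.getBytes(StandardCharsets.UTF_8);

        // No charset parameter + ISO-8859-1 fallback: each UTF-8 byte becomes one char.
        String garbled = decode(utf8Body, "text/xml", StandardCharsets.ISO_8859_1);
        // Same bytes with a UTF-8 fallback: the text round-trips.
        String correct = decode(utf8Body, "text/xml", StandardCharsets.UTF_8);

        System.out.println(garbled.length());          // prints 8 (one char per byte)
        System.out.println(correct.equals(original));  // prints true
    }
}
```

Each Cyrillic letter occupies two bytes in UTF-8, so the ISO-8859-1 reading yields two unrelated characters per letter, which matches the pattern of the garbled values in the response above.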
> Problem with a response encoding
> --------------------------------
>
> Key: CXF-7408
> URL: https://issues.apache.org/jira/browse/CXF-7408
> Project: CXF
> Issue Type: Bug
> Components: Core
> Affects Versions: 3.1.6, 3.1.11
> Environment: OS: Ubuntu 14.04.5 LTS
> Application server: Wildfly 10.1.0.FINAL
> Reporter: Artyom Burylov

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)