tomcat-dev mailing list archives

From bugzi...@apache.org
Subject [Bug 61197] New: Breaking change in Content-Type / Character Encoding handling
Date Mon, 19 Jun 2017 00:19:34 GMT
https://bz.apache.org/bugzilla/show_bug.cgi?id=61197

            Bug ID: 61197
           Summary: Breaking change in Content-Type / Character Encoding
                    handling
           Product: Tomcat 8
           Version: 8.5.15
          Hardware: All
                OS: All
            Status: NEW
          Severity: regression
          Priority: P2
         Component: Catalina
          Assignee: dev@tomcat.apache.org
          Reporter: matthew@matt-shaw.co.uk
  Target Milestone: ----

I *believe* this constitutes some level of regression, based on a distinct
difference from prior behaviour, but please correct me if I'm wrong :) I also
couldn't find any clear mention of this change in the changelog for 8.5.15.

Prior to 8.5.15 (specifically, this commit:
https://github.com/apache/tomcat/commit/b2bab804b543bfe181fe435efe35628ce0e21b39)
the behaviour of `org.apache.catalina.connector.Response` when setting the
content type with an encoding parameter included, e.g.
`setContentType("application/json;charset=MS932")`, was to simply take the
provided encoding string and use it as-is for the output.

As long as the character set was supported by the JVM (as a specific code page,
or an alias of one of the supported code pages), responses would be returned
with the *exact* character set string provided.

Since the above commit / 8.5.15 release, the string is now forcibly rewritten,
with no option to disable this behaviour. For instance, if I specify "MS932" or
"windows-932", it is now replaced with "windows-31j"; "eucjis" is replaced with
"EUC-JP", "sjis" with "Shift_JIS", etc.
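
The replacement values described above match the canonical names that the JVM
reports for each alias. As a minimal sketch (the class and method names here
are hypothetical and not part of Tomcat, and it assumes the JVM ships the
extended Japanese charsets), the mapping can be inspected with
`java.nio.charset.Charset` alone:

```java
import java.nio.charset.Charset;

// Hypothetical helper, not Tomcat code: shows how the JVM maps a charset
// alias to its canonical name.
class CharsetAliasDemo {

    // Returns the JVM's canonical name for the given label,
    // or null if this JVM does not support the charset.
    static String canonicalName(String label) {
        if (!Charset.isSupported(label)) {
            return null;
        }
        return Charset.forName(label).name();
    }

    public static void main(String[] args) {
        // Labels taken from the report above.
        String[] labels = {"MS932", "windows-932", "eucjis", "sjis"};
        for (String label : labels) {
            System.out.println(label + " -> " + canonicalName(label));
        }
    }
}
```

Any code path that round-trips the supplied label through a `Charset` object
and then writes out `name()` would lose the original spelling in this way.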

This may seem like reasonable behaviour for modern systems that we would
*hope* support mapping aliased encodings, but with legacy systems unable to
handle this (and any system that, stupidly or otherwise, checks for a specific
encoding string, possibly in a case-sensitive manner), suddenly we have broken
behaviour. The client expects one encoding string and receives an equivalent
one that it just can't handle.
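
To illustrate the failure mode, here is a sketch of how such a client might
behave. Everything in it is an assumption made for illustration: the class,
the method names, and the exact-match logic are hypothetical and do not
describe any particular legacy system.

```java
import java.util.Locale;

// Hypothetical legacy client check, for illustration only.
class StrictCharsetCheck {

    // Extracts the charset parameter from a Content-Type header value,
    // or returns null if no charset parameter is present.
    static String charsetOf(String contentType) {
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                return p.substring("charset=".length()).trim();
            }
        }
        return null;
    }

    // Accepts the response only if the charset label matches exactly
    // (case-sensitive, no alias resolution).
    static boolean legacyAccepts(String contentType, String expectedLabel) {
        return expectedLabel.equals(charsetOf(contentType));
    }

    public static void main(String[] args) {
        // Label as sent before the change described in this report.
        System.out.println(
            legacyAccepts("application/json;charset=MS932", "MS932"));
        // Label as sent after the change described in this report.
        System.out.println(
            legacyAccepts("application/json;charset=windows-31j", "MS932"));
    }
}
```

A client written this way accepts the first header and rejects the second,
even though both labels name the same encoding.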

Unfortunately I'm now stuck in this situation as a legacy-systems integrations
engineer. We *have* to be able to provide our output with very specific
encoding strings set, or else several dozen systems we (sadly) can't change will
break. Thankfully we caught this in internal testing of the upgrade to 8.5.15
and can put it off temporarily, but we're now stuck with either maintaining our
own patched version of Tomcat to revert this behaviour, no longer updating
(not a real option given security requirements), or possibly migrating to an
alternative servlet container (please no q_q).
