tomcat-users mailing list archives

From "Rashmi Rubdi" <rashmi....@gmail.com>
Subject Re: charset encoding bug
Date Tue, 24 Apr 2007 18:39:55 GMT
Hi Sean,

Thank you for defining the problem.

I tried a few variations of code in Servlets and JSPs, and in almost
every case getContentType() returned only "application/xml" rather than
"application/xml;some character encoding".

The only time I got "application/xml;some character encoding" was when
there was a conflicting setting in the JSP page.

For example, in the following case the character set was appended
because the page directive, <%@ page contentType="text/html;charset=UTF-8"
language="java" %>, conflicts with the content type set explicitly in
the body.

~~~~~~~~~~~~~~~~~~~~~~~~~~
FirstTest.jsp
~~~~~~~~~~~~~~~~~~~~~~~~~~
<%@ page contentType="text/html;charset=UTF-8" language="java" %>
<html>
  <head><title></title></head>
  <body>
  Set Content type:
  <%
      response.setContentType("application/xml");
  %>
  <br/><br/>
  Get Content type:
  <%=response.getContentType()%>
  </body>
</html>

The output was:
- <html>
- <head>
  <title />
  </head>
- <body>
  Set Content type:
  <br />
  <br />
  Get Content type: application/xml;charset=UTF-8
  </body>
  </html>
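
The result above is consistent with a response that stores the media type and the character set separately, so that a later setContentType() call without a charset replaces only the media type. Here is a minimal sketch of that behaviour. ResponseModel is a hypothetical stand-in written for illustration, not Tomcat's actual response class:

```java
import java.util.Locale;

/** Hypothetical model of a response's content-type state (not Tomcat code). */
final class ResponseModel {
    private String mediaType; // e.g. "application/xml"
    private String charset;   // stays null until a charset is supplied

    /**
     * A charset in the argument replaces the stored one; an argument
     * without a charset leaves the previously stored charset untouched.
     */
    void setContentType(String value) {
        String[] parts = value.split(";");
        mediaType = parts[0].trim();
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                charset = param.substring("charset=".length()).trim();
            }
        }
    }

    String getContentType() {
        if (mediaType == null) {
            return null;
        }
        return charset == null ? mediaType : mediaType + ";charset=" + charset;
    }

    public static void main(String[] args) {
        ResponseModel response = new ResponseModel();
        response.setContentType("text/html;charset=UTF-8"); // page directive
        response.setContentType("application/xml");         // scriptlet in the body
        System.out.println(response.getContentType());
    }
}
```

Under this model the charset from the page directive survives the second call, which matches what FirstTest.jsp reported.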


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SecondTest.jsp
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Next, I removed all conflicting content types and made them uniform, as follows:

<%@ page contentType="application/xml" language="java" %>
<html>
  <head><title></title></head>
  <body>
  Set Content type:
  <%
      response.setContentType("application/xml");
      response.setLocale(null);
  %>
  <br/><br/>
  Get Content type:
  <%=response.getContentType()%>
  </body>
</html>

This gave the following output:

- <html>
- <head>
  <title />
  </head>
- <body>
  Set Content type:
  <br />
  <br />
  Get Content type: application/xml
  </body>
  </html>

However, removing the character set resulted in an error on Tomcat's console:
java.lang.NullPointerException
        at org.apache.catalina.util.CharsetMapper.getCharset(CharsetMapper.java:106)
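
The trace points into a locale-to-charset lookup, and SecondTest.jsp calls response.setLocale(null), so one plausible reading is that the lookup dereferenced the null locale. The sketch below only illustrates that failure mode and a guard against it. CharsetLookupSketch and its table entries are hypothetical, and I have not checked what line 106 of CharsetMapper actually does:

```java
import java.util.Locale;
import java.util.Properties;

/** Hypothetical locale-to-charset table, for illustration only. */
final class CharsetLookupSketch {
    private final Properties map = new Properties();

    CharsetLookupSketch() {
        // Illustrative entries; a real mapper would load its own table.
        map.setProperty("en", "ISO-8859-1");
        map.setProperty("ja", "Shift_JIS");
    }

    /** Unguarded lookup: a null locale throws NullPointerException here. */
    String getCharset(Locale locale) {
        return map.getProperty(locale.getLanguage());
    }

    /** Guarded lookup: a null locale simply means "no mapping". */
    String getCharsetOrNull(Locale locale) {
        return locale == null ? null : getCharset(locale);
    }
}
```

If that reading is right, dropping the setLocale(null) call from the page would be the simplest way to test it.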

Researching the HTTP Content-Type header a little led me to RFC 3023
(http://www.ietf.org/rfc/rfc3023.txt). On page 6 it states:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3.1 Text/xml Registration

   MIME media type name: text

   MIME subtype name: xml

   Mandatory parameters: none

   Optional parameters: charset

      Although listed as an optional parameter, the use of the charset
      parameter is STRONGLY RECOMMENDED, since this information can be
      used by XML processors to determine authoritatively the character
      encoding of the XML MIME entity.  The charset parameter can also
      be used to provide protocol-specific operations, such as charset-
      based content negotiation in HTTP.  "utf-8" [RFC2279] is the
      recommended value, representing the UTF-8 charset.  UTF-8 is
      supported by all conforming processors of [XML].
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The above might explain why Tomcat expects the character set parameter
to be appended.
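
To make the recommendation concrete, here is a small sketch that reads the charset parameter from a Content-Type value and appends "utf-8" when none is present. XmlContentType and its method names are made up for this example, and the quoted section covers text/xml, so applying the same default to other media types is my assumption:

```java
import java.util.Locale;

/** Hypothetical helper for Content-Type header values. */
final class XmlContentType {

    /** Returns the charset parameter value, or null when it is absent. */
    static String charsetOf(String contentType) {
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Parameter values may be quoted; strip one pair of quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value;
            }
        }
        return null;
    }

    /** Appends the recommended "utf-8" charset only if none is present. */
    static String withRecommendedCharset(String contentType) {
        return charsetOf(contentType) != null
                ? contentType
                : contentType + ";charset=utf-8";
    }

    public static void main(String[] args) {
        System.out.println(withRecommendedCharset("text/xml"));
        System.out.println(withRecommendedCharset("text/xml;charset=ISO-8859-1"));
    }
}
```

In a servlet or JSP the equivalent would be to pass the full value, charset included, to response.setContentType() so that nothing is left to defaults.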

-Regards
Rashmi
