Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Tomcat Wiki" for change notification.
The "Tomcat/UTF-8" page has been changed by KonstantinKolinko.
The comment on this change is: Removed all content of the page. The up-to-date version of
all this is in FAQ/CharacterEncoding.
http://wiki.apache.org/tomcat/Tomcat/UTF-8?action=diff&rev1=13&rev2=14
--------------------------------------------------
+ This page is obsolete. See [[FAQ/CharacterEncoding|FAQ/CharacterEncoding]] for the up-to-date
version.
- 1.
- JSP pages must include the header:
+ ----
+ CategoryObsolete
- {{{ <%@ page
- contentType="text/html; charset=UTF-8"
- %> }}}
- 2.
- For translation of inputs coming back from the browser there must be a
- method that translates from the browser's ISO-8859-1 to UTF-8. ISO-8859-1
- is the default character encoding for servers and browsers according to the
- [[http://www.ietf.org/rfc/rfc2616.txt|HTTP specification]] section 3.4.1.
-
- {{{ /**
- * Convert ISO-8859-1 format string (which is the default sent by IE)
- * to the UTF-8 format that the database is in.
- */
- public String toUTF8(String isoString)
- {
- String utf8String = null;
- if (null != isoString && !isoString.equals(""))
- {
- try
- {
- byte[] stringBytesISO = isoString.getBytes("ISO-8859-1");
- utf8String = new String(stringBytesISO, "UTF-8");
- }
- catch(UnsupportedEncodingException e)
- {
- throw new RuntimeException(e);
- }
- }
- else
- {
- utf8String = isoString;
- }
- return utf8String;
- } }}}
- I have found that these three steps are all that is necessary to make your
- site accept any language that UTF-8 can work with. I extend my thanks to
- those of you on the Tomcat users list who helped me find these little gems.
-
- (from the tomcat-user mailing list)
-
- '''Note''': This method is not useful because it does not work with non-ASCII characters. "stringBytesISO"
is an ISO-8859-1 byte stream. It cannot be used as a UTF-8 byte stream if it contains non-ASCII
characters.
-
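
The note above can be checked with a small standalone sketch. It is not part of the original page: the class name EncodingRoundTrip is hypothetical, and the conversion is the same as toUTF8 above, rewritten with StandardCharsets (an assumption of Java 7 or later). It suggests that the conversion only recovers text whose UTF-8 bytes were mis-decoded as ISO-8859-1, and that it damages a string that really was ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

public class EncodingRoundTrip {
    // Same conversion as toUTF8 above; StandardCharsets avoids the checked exception.
    static String toUTF8(String isoString) {
        if (isoString == null || isoString.isEmpty()) {
            return isoString;
        }
        byte[] bytes = isoString.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Case 1: the UTF-8 bytes of U+00E9 (0xC3 0xA9) were mis-decoded as ISO-8859-1.
        String misdecoded = new String(
                new byte[] {(byte) 0xC3, (byte) 0xA9}, StandardCharsets.ISO_8859_1);
        // Re-encoding restores the original bytes, so the character is recovered.
        System.out.println(toUTF8(misdecoded).equals("\u00E9"));

        // Case 2: the string was genuinely ISO-8859-1 (single byte 0xE9).
        String genuine = "\u00E9";
        // 0xE9 alone is not valid UTF-8, so the result no longer equals the input.
        System.out.println(toUTF8(genuine).equals(genuine));

        // Pure ASCII is identical in both encodings and passes through unchanged.
        System.out.println(toUTF8("abc").equals("abc"));
    }
}
```

Which case applies depends on how the container decoded the request, which is why setting the request encoding before parameters are read is usually the safer route.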
- '''Alternative solution'''
-
- The solution suggested above works, but from an architectural perspective the correct way
is to add a filter to Tomcat that performs the necessary correction for the deployed application
without any additional changes to the rest of the code.
-
- 1. Make sure the JSP header is set as suggested:
- {{{
- <%@ page contentType="text/html; charset=UTF-8"%>
- }}}
-
- 2. Example of filter:
-
- {{{import java.io.*;
- import java.util.*;
- import javax.servlet.*;
- import javax.servlet.http.*;
-
- public class CharsetFilter implements Filter
- {
- private String encoding;
-
- public void init(FilterConfig config) throws ServletException
- {
- encoding = config.getInitParameter("requestEncoding");
-
- if( encoding==null ) encoding="UTF-8";
- }
-
- public void doFilter(ServletRequest request, ServletResponse response, FilterChain next)
- throws IOException, ServletException
- {
- // Respect the client-specified character encoding
- // (see HTTP specification section 3.4.1)
- if(null == request.getCharacterEncoding())
- request.setCharacterEncoding(encoding);
-
- next.doFilter(request, response);
- }
-
- public void destroy(){}
- }
- }}}
-
- The corresponding portion of the web.xml configuration looks like this:
-
- {{{ <!--CharsetFilter start-->
-
- <filter>
- <filter-name>Charset Filter</filter-name>
- <filter-class>CharsetFilter</filter-class>
- <init-param>
- <param-name>requestEncoding</param-name>
- <param-value>UTF-8</param-value>
- </init-param>
- </filter>
-
- <filter-mapping>
- <filter-name>Charset Filter</filter-name>
- <url-pattern>/*</url-pattern>
- </filter-mapping>
-
- <!--CharsetFilter end-->}}}
-
- The suggested solution originates from [[http://people.comita.spb.ru/users/sergeya/java/ruschars.html|Sergey
Astakhov (all texts are in Russian)]] (sergeya@comita.spb.ru)
-
- '''Important note''': This filter should be as far towards the front of your filter
chain as possible. If some other code calls request.getParameter (or a similar method) before
this filter is invoked, the encoding will not be set in time and your parameters will
still be decoded incorrectly.
-
- '''- TIP -'''
-
- To enable UTF-8 support in the connectors, update the file $CATALINA_HOME/conf/server.xml.
- Example:
-
- {{{<Connector port="8080"
- URIEncoding="UTF-8"/>}}}
-
- or
-
- {{{<Connector port="8080"
- useBodyEncodingForURI="true"/>}}}
-
- * ''URIEncoding'' specifies the character encoding used to decode the URI.
- * ''useBodyEncodingForURI'' indicates whether to use the encoding specified in contentType
(or explicitly set using Request.setCharacterEncoding() method) to decode the URI query parameters.
The default value is set to "false".
-
- '''Note''' that this changes the behavior of reading GET parameters from the request URI
and will not affect POST parameters at all.
-
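
The effect of the URI decoding charset can be illustrated outside Tomcat. The sketch below is not from the original page and does not use Tomcat's own decoder: it uses java.net.URLDecoder from the standard library as a stand-in, and the class name QueryDecodingDemo and the sample value are hypothetical. It decodes the same percent-encoded query value once as UTF-8 and once as ISO-8859-1.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class QueryDecodingDemo {
    // Decodes a percent-encoded value using the named charset.
    static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // "caf" followed by U+00E9, percent-encoded from its UTF-8 bytes.
        String raw = "caf%C3%A9";

        // Decoding with the charset the client used yields four characters.
        System.out.println(decode(raw, "UTF-8").length());

        // Decoding as ISO-8859-1 turns each byte into its own character: five characters.
        System.out.println(decode(raw, "ISO-8859-1").length());
    }
}
```

Under this assumption, a connector that decodes the query string with the wrong charset hands the application the five-character form, and no later call to setCharacterEncoding changes that for GET parameters.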
- == See Also ==
- * http://wiki.apache.org/tomcat/Tomcat/UTF-8
- * http://java.sun.com/developer/technicalArticles/Intl/HTTPCharset/
-
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org