sling-users mailing list archives

From Felix Meschberger <fmesc...@adobe.com>
Subject Re: request.getCharacterEncoding() always returns ISO-8859-1
Date Mon, 28 Feb 2011 10:08:35 GMT
Hi,

I have implemented this support in trunk (see SLING-1998 [1]) and
described it on the Request Parameter Handling page [2].

Regards
Felix

[1] https://issues.apache.org/jira/browse/SLING-1998
[2] http://sling.apache.org/site/request-parameters.html
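
The lookup order proposed in the quoted message below (the _charset_
request parameter first, then a configured default, then ISO-8859-1) can
be sketched roughly as follows. This is only an illustration under stated
assumptions, not the code committed for SLING-1998: the class name, the
method names, and the choice to treat an unknown charset name like a
missing one are all hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not Sling's actual implementation.
final class ParameterEncodingResolver {

    /**
     * Picks the charset used to decode request parameters.
     *
     * @param charsetParam      value of the _charset_ request parameter, or null
     * @param configuredDefault value of the configured default encoding, or null
     */
    static Charset resolve(String charsetParam, String configuredDefault) {
        // 1. The _charset_ parameter wins whenever it names a usable charset.
        Charset fromRequest = lookup(charsetParam);
        if (fromRequest != null) {
            return fromRequest;
        }
        // 2. Otherwise use the configured default, if one is set.
        Charset fromConfig = lookup(configuredDefault);
        if (fromConfig != null) {
            return fromConfig;
        }
        // 3. Otherwise keep the historical default.
        return StandardCharsets.ISO_8859_1;
    }

    // Assumption: an empty, illegal, or unsupported name counts as "missing".
    private static Charset lookup(String name) {
        if (name == null || name.trim().isEmpty()) {
            return null;
        }
        try {
            return Charset.forName(name.trim());
        } catch (IllegalArgumentException e) {
            // Covers IllegalCharsetNameException and UnsupportedCharsetException.
            return null;
        }
    }
}
```

With no _charset_ parameter and no configured default, resolve(null, null)
returns ISO-8859-1, which matches the backwards-compatible behaviour
described in the proposal.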

On Friday, 25 Feb 2011 at 16:12 +0000, Felix Meschberger wrote:
> Hi,
> 
> The problem is that browsers tend not to state the character encoding
> they use when posting data ... Don't ask me why ;-)
> 
> So we have to guess, which is something I really do not like.
> 
> But it looks like browsers send POST data in the same encoding in which
> the form was received. So if the form was served as UTF-8, browsers
> send the data back encoded in UTF-8.
> 
> Now, how does Sling know which encoding was used to send the form?
> Short answer: It cannot know.
> 
> Hence the _charset_ request parameter.
> 
> But listening to our clients and users and understanding that most of
> the time UTF-8 is used anyway, how about this solution:
> 
>   * We stick with the _charset_ parameter. Whatever that parameter
>     conveys is used to decode parameters.
>   * If the parameter does not exist, we support a new configuration
>     option defining the default encoding to be used.
>   * If the configuration option is also missing, we default to the
>     same value as we do today, which is ISO-8859-1.
> 
> Of course the configuration option would not be set by default (for
> backwards compatibility reasons).
> 
> Would that help your case?
> 
> Regards
> Felix
> 
> On Wednesday, 20 Oct 2010 at 14:05 -0400, sam lee wrote:
> > According to:
> > http://download.oracle.com/javaee/6/api/javax/servlet/ServletRequest.html#getCharacterEncoding%28%29
> > request.getCharacterEncoding() should return "the name of the character
> > encoding used in the body of this request."
> > 
> > But request.getCharacterEncoding() always seems to return ISO-8859-1.
> > For example, my html.jsp looks like:
> > <%@ page language="java" contentType="text/html; charset=UTF-8"
> >     pageEncoding="UTF-8"%>
> > ...
> > <form method="POST" action="/some/path"
> >     accept-charset="utf-8"
> >     enctype="application/x-www-form-urlencoded; charset=utf-8">
> >     <input type="hidden" name="_charset_" value="UTF-8" />
> >     <input type="submit" value="Save" />
> > ...
> > 
> > Then I would expect request.getCharacterEncoding() (from POST.jsp) to
> > return "UTF-8". But it still returns "ISO-8859-1".
> > 
> > Is this intended?
> > 
> > From the Sling documentation:
> > http://sling.apache.org/site/request-parameters.html#RequestParameters-CharacterEncoding
> > I don't get this part: "This identity transformation happens to generate
> > strings as the original data was generated with ISO-8859-1 encoding."
> > 
> > As long as I set _charset_ to the encoding of the rendered page (the one
> > containing the <form>), I don't have a problem. But I was wondering
> > whether .getCharacterEncoding() should return whatever the request body
> > was encoded as, not what Sling used to perform the "identity transform".
> > 
> > Also, if _charset_ is missing from the request, wouldn't it be better
> > to set it automatically to the request body encoding? Or do browsers
> > not send request body encoding information?
> > 
> > Thanks.
> > Sam
> 
> 
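
On the "identity transformation" sentence Sam asks about in the quoted
message: one way to read it is that decoding bytes as ISO-8859-1 maps
each byte to exactly one character, so the original bytes can always be
recovered from the resulting string and decoded again with the real
encoding. The sketch below demonstrates only that general round-trip
property in plain Java; the class and method names are hypothetical, and
it makes no claim about how Sling implements this internally.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; illustrates the ISO-8859-1 round trip only.
final class IdentityTransformDemo {

    /**
     * Re-decodes a string that was decoded as ISO-8859-1 although the
     * underlying bytes were really in another encoding.
     */
    static String redecode(String latin1Decoded, Charset actualEncoding) {
        // ISO-8859-1 maps bytes 0x00..0xFF one-to-one onto U+0000..U+00FF,
        // so this step recovers the original bytes without loss.
        byte[] rawBytes = latin1Decoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, actualEncoding);
    }

    public static void main(String[] args) {
        // "Gruesse" with u-umlaut and sharp s, written with Unicode escapes.
        String original = "Gr\u00fc\u00dfe";

        // What a browser would send for a UTF-8 form.
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);

        // What the application sees when the bytes are decoded as ISO-8859-1.
        String misread = new String(wire, StandardCharsets.ISO_8859_1);

        // Applying the real encoding (for example the _charset_ value).
        String repaired = redecode(misread, StandardCharsets.UTF_8);

        System.out.println(original.equals(repaired)); // prints "true"
    }
}
```

If the data really was ISO-8859-1, calling redecode with ISO-8859-1
returns the same string unchanged, which is the sense in which the
transformation is an identity.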


