tomcat-dev mailing list archives

From Eugen Kuleshov <a...@hco.kollegienet.dk>
Subject Re: Proposal: RequestImpl
Date Tue, 02 May 2000 00:23:32 GMT
Costin Manolache wrote:
 
> >  > - I know this is a very important issue - and we need to find a good
> >  > solution, but it's important to do it in a clean way. I can understand
> >  > what happens if I look at the code, but it's not easy ( I'm talking
> >  > about tomcat code, not your code ). If we can factor out the
> >  > encoding/decoding probably everything will be much simpler.
> >
> > What about the solution everybody agreed on in servlet-interest? We should
> > not guess the charset at all, leaving it to the servlet developer or custom
> > software to determine the encoding. We can never actually tell it with 100%
> > accuracy. But there should be a .setCharacterEncoding() method in place for
> > the ServletRequest, and the parsers should take care of the encoding set
> > with this method. Speaking of the implementation, I agree with you, the
> > stream approach is not the most efficient solution, but it could be easily
> > done using the single byte[] buffer.
> 
> We need to implement getReader() anyway - can't get around that.
> We also need to at least respect the encoding if it is specified as part of
> the POST method - getParameters() must use the right encoding if specified.

  But the encoding specified in the request can be wrong in some cases.
Anyway, getParameters() for both POST and GET should use the encoding
set by the servlet developer (request.setCharacterEncoding(String enc)).
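
  A minimal sketch of what this means for parameter parsing, assuming
the raw percent-encoded value is still available and the developer has
named the charset. The class and the decodeParam helper are hypothetical
and not part of the JSDK or Tomcat; only java.net.URLDecoder is a real
API here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical illustration, not Tomcat code.
class ParamDecodeSketch {

    // Decodes one percent-encoded parameter value using the charset
    // chosen by the servlet developer instead of a guessed one.
    static String decodeParam(String raw, String enc)
            throws UnsupportedEncodingException {
        return URLDecoder.decode(raw, enc);
    }

    public static void main(String[] args) throws Exception {
        // The bytes F2 E5 F1 F2 spell a four-letter Russian word in windows-1251.
        String raw = "%F2%E5%F1%F2";

        // Decoded with the charset the developer asked for: Cyrillic text.
        System.out.println(decodeParam(raw, "windows-1251"));

        // Decoded with a wrong default: accented Latin letters instead.
        System.out.println(decodeParam(raw, "ISO-8859-1"));
    }
}
```

  The same raw value gives two different strings, so the parser has no
way to pick the right one without being told the charset.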
 
> Unfortunately there is no .setCharacterEncoding method in ServletRequest,

  but we still hope this will be added in JSDK 2.3

> or getReader( encoding ) - so there is little we can do about that.

  This is not necessary. getReader() should use the encoding from
.setCharacterEncoding().

> Even if next Servlet API will have those methods, we still need an efficient
> way to implement encoding/decoding ( for example it is recommended to use
> Reader/Writer - that means at least UTF to ASCII conversion ).

  Btw, ASCII is a 7-bit encoding, but there are a lot of 8-bit encodings,
such as KOI8-R, windows-1252, windows-1251, et cetera.
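
  To illustrate, here is a small sketch that decodes the same single
byte under several charsets. The class and the decode helper are
hypothetical; it assumes the JVM ships the KOI8-R and windows-125x
charsets, which the Java specification does not guarantee.

```java
import java.nio.charset.Charset;

// Hypothetical illustration: one byte, several meanings.
class EightBitSketch {

    // Decodes the bytes with the named charset. Bytes that are invalid
    // in that charset come back as U+FFFD (the replacement character).
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xC1}; // high bit set, so not valid 7-bit ASCII

        String[] names = {"KOI8-R", "windows-1251", "windows-1252", "US-ASCII"};
        for (String name : names) {
            String s = decode(data, name);
            // Print the code point rather than the glyph to avoid
            // depending on the console encoding.
            System.out.printf("%-13s U+%04X%n", name, (int) s.charAt(0));
        }
    }
}
```

  The byte 0xC1 is a different letter in each 8-bit charset and has no
meaning at all in ASCII.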
 
> If we find a good way to "guess" ( even if it's complex - as long as we can
> keep it modular ) - I see no reason not to implement it. HTTP is supposed
> to be international, and in time ( I hope ) the browsers will have fewer
> bugs ( and use UTF ? ), eliminating the complex encoding code.
> 
> I find this a very difficult problem, and I spent some time on this - asking
> servlet/JSP developers to deal with charsets  will not be easy.

  It should be solved in the JSDK, not in the reference implementation.
  
  Right now a lot of Russian developers use their own RequestWrappers to
work around problems in the JSDK and the reference implementation. That
is not a good situation.
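
  One common form of such a workaround can be sketched as below. It
assumes the container has already decoded the parameter bytes as
ISO-8859-1, which may not hold for every container. The class and the
redecode helper are hypothetical, and a real wrapper would call
something like it from its own getParameter().

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the re-decoding trick, not Tomcat code.
class RedecodeSketch {

    // Assumption: the container decoded the bytes as ISO-8859-1, so each
    // char maps back to exactly one original byte. Recover those bytes
    // and decode them again with the encoding the page actually used.
    static String redecode(String misdecoded, String actualEncoding) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, Charset.forName(actualEncoding));
    }

    public static void main(String[] args) {
        // What a servlet would see if windows-1251 bytes F2 E5 F1 F2
        // had been decoded as ISO-8859-1.
        String fromContainer = "\u00F2\u00E5\u00F1\u00F2";
        System.out.println(redecode(fromContainer, "windows-1251"));
    }
}
```

  Every application has to carry this code and know the container's
default, which is why it belongs in the API instead.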

  Eugen Kuleshov.
