tomcat-dev mailing list archives

From Greg.C...@pfizer.com
Subject RE: Intro/question possible buglet with Content-Type and Charsets - now more of an RFC
Date Thu, 18 Dec 2003 16:44:18 GMT
Hi All,

(We may be barking up the wrong tree here; if so, please point me in the
right direction.)

This is still causing us issues - as IE fails to parse a charset when it is
tacked on to Content-Type: application/vnd.ms-excel

It would appear that the charset is being tacked onto the Content-Type in
the setContentType method of
catalina/src/share/org/apache/catalina/connector/ResponseBase.java
whenever it is not supplied in the Content-Type (it looks for a ';').

The encoding can never be null as it is extracted from the locale in the
setLocale method below it.

I understand this to mean that the charset will always be tacked on
irrespective of the Type.

However, I cannot find an explicit reference that says not to define a
charset for binary types, but I cannot see why you would want to.

HTTP 1.1 (RFC 2068, http://www.w3.org/Protocols/rfc2068/rfc2068) implies
that there is a default charset for text types, which makes sense:

'When no explicit charset parameter is provided by the sender, media
subtypes of the "text" type are defined to have a default charset value of
"ISO-8859-1"' 

I take that to mean it is fair enough to add it to text/* types.

RFC 1341 (http://www.faqs.org/rfcs/rfc1341.html) states that:

'2.a.  A "text" Content-Type value, which can be used to represent  textual
information  in  a  number  of character  sets  and  formatted  text
description languages in a standardized manner.'

But there is no mention of charsets for application types:

'2.c.  An "application" Content-Type value, which can be used  to transmit
application data or binary data, and hence,  among  other  uses,  to
implement  an electronic mail file transfer service.'

What I would suggest is a small if wrapper that only adds a default charset
when the Content-Type starts with text/

Pseudocode below (not tested):

###########
catalina/src/share/org/apache/catalina/connector/ResponseBase.java

 public void setContentType(String type) {

        if (isCommitted())
            return;

        if (included)
            return;     // Ignore any call from an included servlet

        this.contentType = type;
        if (type.indexOf(';') >= 0) {
            // Parameters were supplied: take the charset from them
            encoding = RequestUtil.parseCharacterEncoding(type);
            if (encoding == null)
                encoding = "ISO-8859-1";
        } else {
            // No parameters: only add the default charset to text/* types
            if (encoding != null && type.startsWith("text/"))
                this.contentType = type + ";charset=" + encoding;
        }

    }
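
To make the intended rule easier to try outside Tomcat, here is a minimal
standalone sketch of the same check. The class name ContentTypeSketch and
the method withDefaultCharset are made up for illustration and are not part
of Tomcat; the case-insensitive comparison is my own assumption, on the
basis that media type names are case-insensitive.

```java
import java.util.Locale;

// Hypothetical helper, not Tomcat code: illustrates "only add a default
// charset to text/* types that carry no parameters of their own".
final class ContentTypeSketch {

    private ContentTypeSketch() {}

    static String withDefaultCharset(String type, String encoding) {
        if (type == null || encoding == null) {
            return type;            // nothing sensible to add
        }
        if (type.indexOf(';') >= 0) {
            return type;            // caller supplied parameters, leave as-is
        }
        // Assumption: compare in lower case, since type names ignore case.
        String lower = type.trim().toLowerCase(Locale.ENGLISH);
        if (lower.startsWith("text/")) {
            return type + ";charset=" + encoding;
        }
        return type;                // e.g. application/*, image/*: untouched
    }
}
```

With this rule, application/vnd.ms-excel would be sent unchanged, while
text/html would still pick up the default charset.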

Regards,

Greg


> -----Original Message-----
> From: Tim Funk [mailto:funkman@joedog.org]
> Sent: 16 December 2003 18:09
> To: Tomcat Developers List
> Subject: Re: Intro/question possible buglet with Content-Type and
> Charsets .
> 
> 
> Yeah, nagoya.apache.org seems down. Hopefully it will be back 
> soon. The bug 
> has good detail of what and how to fix.
> 
> -Tim
> 
> Greg.Cope@pfizer.com wrote:
> 
> > Thanks Tim,
> > 
> > Having a little trouble getting anything from bugzilla;
> > nagoya.apache.org seems to be having a little trouble!
> > 
> > Looking in the archives for this id, I see that someone has a 4.1.29
> > patch and a compiled class, but cannot see either email address or
> > content via the archive.
> > 
> > Ho hum....
> > 
> > Thanks for the pointer.
> > 
> > Greg
> > 
> > 
> > 
> > 
> >>-----Original Message-----
> >>From: Tim Funk [mailto:funkman@joedog.org]
> >>Sent: 16 December 2003 12:31
> >>To: Tomcat Developers List
> >>Subject: Re: Intro/question possible buglet with Content-Type and
> >>Charsets.
> >>
> >>
> >>http://nagoya.apache.org/bugzilla/show_bug.cgi?id=24970
> >>
> >>Greg.Cope@pfizer.com wrote:
> >>
> >>>Hi All,
> >>>
> >>>Quick intro, and then a question;
> >>>
> >>>We use tomcat to host java web applications at our location.  My client
> >>>requires us to follow very strict rules for deploying software, that means
> >>>it can be a documentation intensive process (evidence gathering/ IQP's etc
> >>>....).  So we rarely upgrade as it is quite a lot of work..... Luckily
> >>>tomcat is excellent and rarely needs upgrading or patching.
> >>>
> >>>Now the question;
> >>>
> >>>Tomcat 4.1.29 seems to insist on adding a charset to the content type, even if
> >>>a Content-Type has been set using response.setContentType or similar
> >>>(without a charset).  Tomcat 5 seems to do something similar judging from
> >>>http://www.mail-archive.com/tomcat-dev@jakarta.apache.org/msg49015.html but
> >>>I think it fails to check if the Content type is a text one (HTML) and adds
> >>>it for any content type, which would appear not to be right IMHO.
> >>>
> >>>Without wishing to appear rude :-) I need to change this behaviour and
> >>>remove the insertion of the charset for non text based Content-Types  eg:
> >>>application/vnd.ms-excel
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: tomcat-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: tomcat-dev-help@jakarta.apache.org
> 


