tomcat-users mailing list archives

From "Peter J. Farrell" <pe...@mach-ii.com>
Subject Re: request.getPathInfo() gets truncated when ";" is present
Date Fri, 24 Jul 2009 07:58:55 GMT
@All, thanks everybody for responding so far. I apologize for not 
originally including the version of Tomcat we're using (6.0.18).  It was 
an oversight on my part in my hurried effort to write the email -- 
totally just blitzed that I should include that (my first post to the 
Tomcat list).

I realize that I accidentally posted an incorrect example.  The question 
posted by Christopher S. using the URLEncoded value of %3B is the 
correct encoding.  I shouldn't have pasted in a plain text ";".

This ";" is an edge case for us.  This URL encoding is part of a SES / 
Friendly URL implementation in a framework I contribute to.  It was 
logged as a defect report by a framework user because originally we 
were not URL encoding special characters (they were including a short 
"error" message in the URL proper).  We have no clue what people are 
generating or how they are using it in URLs; we are just trying to get 
an idea of why this wasn't working.

@Christopher S., sadly Tomcat still truncates the path info even when 
encoding the ";" as %3B.  We've actually resorted to using a 
"unicode"-like representation (changing "U+03B" to U_03B so as not to 
use a +) and transforming it back when processing incoming requests.
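
For illustration, the substitution described above can be sketched as a pair of string helpers. This is only a rough sketch, not the framework's actual code: the class and method names are made up, and it assumes the literal text "U_03B" never occurs in real path data (if it did, decoding would wrongly turn it into a ";").

```java
// Hypothetical helper; names are illustrative, not taken from the framework.
final class SemicolonToken {
    // Stand-in token from the message: "U+03B" with the "+" swapped for "_".
    static final String TOKEN = "U_03B";

    private SemicolonToken() {}

    // Applied when building a URL: hide every ";" behind the token.
    static String encode(String pathSegment) {
        return pathSegment.replace(";", TOKEN);
    }

    // Applied when processing an incoming request: restore every ";".
    // Assumption: TOKEN never appears in genuine data.
    static String decode(String pathSegment) {
        return pathSegment.replace(TOKEN, ";");
    }
}
```

Because the token contains only letters, digits and an underscore, it needs no percent-encoding and nothing in it looks like a path parameter delimiter.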

@Chuck, thanks for your snarky response. I'll leave it at that.

@Bill, thanks for mentioning the specific RFC referencing the encoding 
of ";".  That's what the team here figured out as well and it's nice to 
have an independent person verify our assumptions after reading the RFC 
sections.  Is it safe to assume that the Tomcat code base treats 
anything after the ";" as the jsessionid?  That's our assumption at the 
moment.  Yes, there is a workaround: use request.getRequestURI() 
instead, as that has the whole URI, and manually remove the "absolute" 
path to get the complete path info.
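
A minimal sketch of that workaround, kept free of servlet imports so it can be tried on its own. The class, method and parameter names are hypothetical; the three strings are assumed to be the values returned by request.getRequestURI(), request.getContextPath() and request.getServletPath(). Note that getRequestURI() is not decoded by the container while getServletPath() is, so the prefix match below assumes neither prefix contains escaped characters.

```java
// Hypothetical helper; not part of Tomcat or of any CFML engine.
final class RawPathInfo {
    private RawPathInfo() {}

    /**
     * Returns whatever follows contextPath + servletPath in requestUri,
     * including any ";" and anything after it. Returns null when nothing
     * follows, or when the URI does not start with the expected prefix.
     * The result is still percent-encoded, exactly as it was sent.
     */
    static String fromRequestUri(String requestUri, String contextPath, String servletPath) {
        String prefix = contextPath + servletPath;
        if (!requestUri.startsWith(prefix)) {
            return null; // unexpected shape: let the caller fall back to getPathInfo()
        }
        String rest = requestUri.substring(prefix.length());
        return rest.isEmpty() ? null : rest;
    }
}
```

For example, fromRequestUri("/app/index.cfm/somePathInfo;withMoreInfo/", "/app", "/index.cfm") returns "/somePathInfo;withMoreInfo/". In a servlet filter, the same helper could back an HttpServletRequestWrapper whose getPathInfo() is overridden, though any ";jsessionid=..." suffix would then have to be stripped by hand.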

Our framework is deployed on several different CFML servlets (three 
different vendors), and the way to get at the original HTTP request 
wrapper differs a bit between them.  We'll probably stick to using the 
poor man's encoding, a modified unicode representation of ";", in the 
end.  Another solution is to write a filter that uses getRequestURI() 
to replace the bad path info in the request with the full-length version.

Thanks again,
.Peter

Bill Barker said the following:
>> Shouldn't that be /index.cfm/somePathInfo&amp%3BwithMoreInfo/
>>
>> ?
>>
>> If you try the above URL, does it work?
>>
>> java.net.URLEncoder will encode ";" as "%3B".
>>
>> See the URL Specification (RFC 1738,
>> http://www.ietf.org/rfc/rfc1738.txt), section 2.2 "URL Character
>> Encoding Issues":
>>
>> "
>> Many URL schemes reserve certain characters for a special meaning:
>> their appearance in the scheme-specific part of the URL has a
>> designated semantics. If the character corresponding to an octet is
>> reserved in a scheme, the octet must be encoded.  The characters ";",
>> "/", "?", ":", "@", "=" and "&" are the characters which may be
>> reserved for special meaning within a scheme. No other characters may
>> be reserved within a scheme.
>> "
>>
>> The HTTP specification does not specifically say that semi-colons are
>> reserved, but perhaps the common interpretation of the URL spec is such
>> that semi-colons should always be encoded.
>>
>>     
>
> Actually it does, just by reference.  Section 3.2.1 of RFC 2616 defers to 
> RFC 2396.  And section 3.3 of that RFC gives a special meaning to ';'. 
> Tomcat doesn't handle this correctly according to the RFC, but no 
> developer/contributor has had enough of an itch to fix it.  But I doubt that 
> fixing it would help the OP much.
>
> The fully compliant Tomcat would have to remove anything after a ';' 
> (including the ';') up until the next '/' (if any) for the purpose of 
> mapping the request.  It should then re-include them in the various parts of 
> the request URI (except for ";jsessionid").  So it's a lot of work to 
> implement an arcane feature that has plenty of workarounds.
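
The mapping step described in the quoted paragraph above could be sketched roughly as follows. This is an illustration only, not Tomcat code: the class and method names are hypothetical, and it covers just the "strip for mapping" half, not the re-inclusion of the parameters into the parts of the request URI afterwards.

```java
// Hypothetical sketch of RFC 2396 section 3.3 style parameter stripping.
final class PathParams {
    private PathParams() {}

    // Drops each ";" and everything after it up to the next "/" (or the end),
    // leaving the bare segments that request mapping would be based on.
    static String stripForMapping(String path) {
        StringBuilder out = new StringBuilder(path.length());
        boolean inParams = false;
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            if (c == '/') {
                inParams = false;   // a new segment starts: resume copying
                out.append(c);
            } else if (c == ';') {
                inParams = true;    // parameters start: skip until next "/"
            } else if (!inParams) {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this sketch, "/index.cfm;jsessionid=ABC/some;v=1/info" maps as "/index.cfm/some/info", and a path without any ";" is returned unchanged.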
>
>   
>> - -chris

