tomcat-users mailing list archives

From Christopher Schultz <ch...@christopherschultz.net>
Subject Re: Non-US-ASCII letters in url-mapping
Date Thu, 27 Jul 2017 12:34:01 GMT
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Martin,

On 7/27/17 4:20 AM, Martin Nybo Andersen wrote:
> Hi Christopher,
> 
> On Wed, 26 Jul 2017, Christopher Schultz wrote:
> 
> Martin,
> 
> On 7/26/17 7:08 AM, Martin Nybo Andersen wrote:
>>>> Hi list,
>>>> 
>>>> I've written some servlets with a url-pattern that includes
>>>> non-US-ASCII letters (specifically the Danish letter æ). This
>>>> works fine in Tomcat 8.5.14 and previous versions.
>>>> However, I get 404 Not Found with versions 8.5.15 and
>>>> 8.5.16.
>>>> 
>>>> If I get the mappings from the servlet's ServletRegistration,
>>>> then the non-US-ASCII letters are replaced by question
>>>> marks.
> 
> Exactly where are they replaced by question marks? This often
> happens when your system default character set is something like
> ISO-8859-1 and you are looking at a log file which had mangled your
> characters... but the characters being used at runtime are
> correct.
> 
>> My url-mapping is /mælk/data; however, when I call getMappings()
>> I get /m?lk/data. If I set the url-mapping to /m%C3%A6lk/data,
>> then getMappings() gives me the correct /mælk/data.

Are you seeing that in a debugger, a log file, etc.? Where?

What about the decoded URL as it comes into Tomcat during a request?
Does that look okay to you?
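
One way to tell a real question mark from a display problem is to print
the code points of the string instead of the string itself. The sketch
below is only an illustration: the class name CodePointDump and the
helper dump() are hypothetical and not part of Tomcat or the Servlet
API. In a real check, the argument would be each value returned by
getMappings().

```java
public class CodePointDump {

    // Renders every code point of s as U+XXXX, separated by spaces, so the
    // result does not depend on console or log file encoding.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // \u00e6 is the letter ae-ligature, written as an escape so the
        // source file encoding cannot interfere with the demonstration.
        System.out.println(dump("/m\u00e6lk")); // contains U+00E6
        System.out.println(dump("/m?lk"));      // contains U+003F
    }
}
```

If the output shows U+003F where U+00E6 was expected, the character was
replaced at runtime and the log file or terminal is not to blame.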

>> My guess is that Tomcat supports url-encoded url-mappings by
>> calling UDecoder.URLDecode(). Trouble is that since revision
>> 1793440 the string is treated as a US-ASCII string (see lines 344
>> and 345 of java/org/apache/tomcat/util/buf/UDecoder.java). Before
>> this revision the string was treated as either ISO-8859-1 or the
>> supplied encoding (UTF-8 in my case).

When you say "the supplied encoding" do you mean the declared encoding
of web.xml? Or the declared encoding of the HTTP request? Or some
other encoding?
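
For reference, the reported symptom can be reproduced with plain JDK
calls. This is a minimal sketch of the general effect, not Tomcat's
UDecoder code: it assumes only that the mapping string is converted to
bytes as US-ASCII somewhere along the way. The class and method names
are made up for the example, and java.net.URLDecoder stands in for
Tomcat's own decoder.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class MappingEncodingDemo {

    // Round-trips s through US-ASCII bytes. String.getBytes(Charset)
    // replaces each unmappable character with the charset's replacement
    // byte, which is '?' for US-ASCII.
    static String viaUsAscii(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    // Percent-decodes s, interpreting the decoded bytes as UTF-8.
    static String decodePercent(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String mapping = "/m\u00e6lk/data";       // literal ae-ligature
        String encoded = "/m%C3%A6lk/data";       // pure ASCII form

        System.out.println(viaUsAscii(mapping));  // the ligature is lost
        // The percent-encoded form survives US-ASCII and decodes cleanly.
        System.out.println(decodePercent(viaUsAscii(encoded)).equals(mapping));
    }
}
```

The percent-encoded mapping contains only ASCII characters, which would
explain why it passes through unchanged while the literal form does not.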

>> I've created the following bug report: 
>> https://bz.apache.org/bugzilla/show_bug.cgi?id=61351

Okay, we can move the discussion there.

- -chris
-----BEGIN PGP SIGNATURE-----
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBCAAGBQJZed25AAoJEBzwKT+lPKRYI3AQALekgUKcb7SfzLXZOrI/KbLY
5OYqx0Wsdy3zSz+ze0/rfPfjQqTdpUVR0N+oCwpCX4Ws1GHck30U396W5k+LvwxV
zj9UjX37SCDhQGtyRCCR9IWBwMdmRh8yESjMDhWdZMukWhPD2wkpcfCkVwGFxGnU
Kht2hI9DP9l+9d6a00ABCgLHsYypZTbBAhYDOtfshENIbo4ZpHkcoR2nJhrUZsIn
LPcXXtYuisGVa6Rn1umeeVf486O0fN7Cd6e5eScIXT5XngXu80Lk5RVl4sYZXTeV
Ceh2qNVod4zyIMt5vpGh2fmhvhlum6b9rgNvw3BYsycLW+aqLTSE+53AbRU0AhEV
hMpoMDPMKbJhtSAXvo3GMo44esbhWvhHQLDTKbFipwWbC1EEmG10ZMiDlpRXdqfy
a/mtgImAMfIg6ujmIQ2kIImFay0T0QdJ4TqTtHkMbAqFAEyWKujBZ9Z4+8ws4mRv
2YYJKyKMamIT/C456mUhTR7fVYyhr4E1ZGjsGISIs1fViFNxFcEo4VSs98vjwvsv
nIPon/97lrrh3v/NldfpnKLchCuuYzsPPykzsVKwVqNp8N4fAf+13RmYCO9xEWGE
ETbi3Rl7Yjfg2dTQ2K0H2sWTAMnh9G3QC/DT4luIRkPL/dExODedbWSo2z2VFSkW
ndhQAS+pZ/vf3PKM5nv2
=NtGe
-----END PGP SIGNATURE-----
