Konstantin Kolinko wrote:
..
>
> 2011/5/9 André Warnier <aw@ice-sa.com>:
>> (like a space encoded as a "+", and a "+"
>> encoded as %xy),
>
> Andre, one small correction:
> It sometimes causes confusion, but encoding of space as '+' works only
> in the query part of the URL.
> The unambiguous way to encode a space regardless of its position in the URL is %20.
>
> Encoding a space as '+' comes from the "url encoding" scheme
> defined by the HTML standard, in the chapter that describes how HTML
> forms are submitted.
>
Agreed, my mistake.
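To make the difference concrete, here is a minimal sketch in Python (chosen only for illustration; the sample value "a b+c" is made up). It contrasts plain percent-encoding with the form encoding that maps a space to "+":

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

text = "a b+c"  # arbitrary sample value with a space and a literal "+"

# Percent-encoding, usable anywhere in a URL: space -> %20, "+" -> %2B
percent_encoded = quote(text, safe="")

# HTML form encoding (application/x-www-form-urlencoded): space -> "+", "+" -> %2B
form_encoded = quote_plus(text)

# Decoding: "+" only turns back into a space under the form-encoding rules
decoded_as_path = unquote("a+b")       # the "+" stays a "+"
decoded_as_form = unquote_plus("a+b")  # the "+" becomes a space

print(percent_encoded)        # a%20b%2Bc
print(form_encoded)           # a+b%2Bc
print(repr(decoded_as_path))  # 'a+b'
print(repr(decoded_as_form))  # 'a b'
```

So a literal "+" always has to be sent as %2B in form data, otherwise it comes back as a space.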
Also, in the query string part, an unencoded ";" could be taken as a query parameter
separator (an alternative to "&"), couldn't it?
But I forget which RFC that is, if any.
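Whether ";" is honoured as a separator depends on the parser on the receiving side. As one illustration only (this is Python's standard library, not Tomcat, and the query string is a made-up example), recent Python 3 releases split on "&" by default and accept ";" only when asked to:

```python
from urllib.parse import parse_qs

query = "a=1;b=2"  # hypothetical query string using ";" between parameters

# Default: only "&" separates parameters, so ";" ends up inside the value of "a"
with_ampersand = parse_qs(query)

# Opt-in: treat ";" as the parameter separator instead of "&"
with_semicolon = parse_qs(query, separator=";")

print(with_ampersand)  # {'a': ['1;b=2']}
print(with_semicolon)  # {'a': ['1'], 'b': ['2']}
```

The safe course is therefore to percent-encode a literal ";" in a parameter value (as %3B) rather than rely on how a given server splits the query string.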
Now one additional comment. You said:
..
>> about SEOability and user-friendliness - this especially concerns path
>> with international characters in URLs, e.g. http://site/pathąčęė
>
> That is up to the browser how to show those URLs. Many browsers have a
> setting how to display such URLs. E.g. try to browse non-English
> Wikipedia for an example of i18n addresses.
..
I think that the above is a bit confusing.
The "site" (or hostname) part of the URL is subject to a different encoding than the
path part (/pathąčęė). The path part must be URL-encoded (percent-encoded), but the
hostname part uses "punycode" instead, see http://en.wikipedia.org/wiki/Punycode.
Just another example of the current mess with character sets and encodings...
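A minimal Python sketch of the two encodings side by side. The hostname "bücher.example" is a hypothetical stand-in, since "site" in the example URL is plain ASCII; the path is the one quoted above. Note that Python's built-in "idna" codec follows the older IDNA 2003 rules, so treat this as an illustration rather than a reference implementation:

```python
from urllib.parse import quote

host = "bücher.example"  # hypothetical non-ASCII hostname
path = "/pathąčęė"       # path taken from the example URL in this thread

# Hostname: each non-ASCII label is converted to punycode with an "xn--" prefix
ascii_host = host.encode("idna").decode("ascii")

# Path: non-ASCII characters are encoded as UTF-8 bytes, then percent-encoded
ascii_path = quote(path)

url = "http://" + ascii_host + ascii_path

print(ascii_host)  # xn--bcher-kva.example
print(ascii_path)  # /path%C4%85%C4%8D%C4%99%C4%97
print(url)
```

The same Unicode text thus ends up in two entirely different ASCII forms, depending on which part of the URL it sits in.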
I guess one has to have a first or last name containing so-called "diacritic" characters
to really appreciate these issues.