esme-dev mailing list archives

From Vassil Dichev <vdic...@apache.org>
Subject Re: ESME-26 The message parser should ignore # in urls
Date Tue, 27 Oct 2009 22:31:22 GMT
I think ESME-26 is resolved. Here's the comment I've included:

----------------------------------------------------------------
The problem with URL parsing is that it's based on RFC 1738, which
does not specify that a hash sign (#) is part of the URL. This is
resolved in RFC 1630, which defines the so-called "fragment id", which
is used to specify anchors.

The naive approach of using '#' as a valid symbol for the hsegment
will accept some invalid URLs, which seem to contain multiple
concatenated anchors:

http://test.url/segment#anchor1#anchor2

Not only that, but allowing the hash sign (#) only in hsegment means
that URLs with a query string before the anchor, like the one Dick
has pasted, are rejected as invalid.
----------------------------------------------------------------
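
To illustrate the point of the comment, here is a minimal sketch in
Python (not the actual ESME parser) of a grammar that allows at most
one fragment, placed after the optional query string. The character
classes are a simplified reading of RFC 1738, and the names
is_valid_url, _HSEGMENT, _SEARCH and _FRAGMENT are made up for this
example.

```python
import re

# Simplified character classes, loosely modelled on RFC 1738.
# These are illustrative assumptions, not the grammar ESME uses.
_UCHAR = r"(?:[A-Za-z0-9$\-_.+!*'(),]|%[0-9A-Fa-f]{2})"
_HSEGMENT = rf"(?:{_UCHAR}|[;:@&=])*"
_SEARCH = _HSEGMENT
# The fragment deliberately excludes '#', so anchors cannot be chained.
_FRAGMENT = rf"(?:{_UCHAR}|[;:@&=/?])*"

_HOST = r"[A-Za-z0-9](?:[A-Za-z0-9.\-]*[A-Za-z0-9])?"

_URL = re.compile(
    rf"https?://{_HOST}(?::\d+)?"                              # scheme, host, optional port
    rf"(?:/{_HSEGMENT}(?:/{_HSEGMENT})*(?:\?{_SEARCH})?)?"     # optional path and query
    rf"(?:#{_FRAGMENT})?"                                      # at most one fragment, always last
)


def is_valid_url(text: str) -> bool:
    """Return True if the whole string matches the sketched grammar."""
    return _URL.fullmatch(text) is not None
```

Because the fragment is a separate, optional, final component rather
than an extra symbol inside hsegment, a URL with a query string before
the anchor is accepted, while http://test.url/segment#anchor1#anchor2
is rejected.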

The problem with displaying Chinese characters is different. I believe
the characters are stored correctly, but are displayed in a different
encoding, which Xuefeng might not expect (try changing the encoding in
the browser). This means the URLs will work, but you might not always
see them as expected.
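
As a rough illustration of that kind of mismatch (a sketch only; the
encodings actually involved in ESME are not stated here, so UTF-8 and
Latin-1 are assumptions, and the helper names are hypothetical):

```python
from urllib.parse import quote, unquote


def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as Latin-1,
    imitating a viewer that guesses the wrong encoding."""
    return text.encode("utf-8").decode("latin-1")


def recover_from_latin1(garbled: str) -> str:
    """Undo misread_as_latin1: the stored bytes were never damaged."""
    return garbled.encode("latin-1").decode("utf-8")


def url_round_trip(text: str) -> str:
    """Percent-encode text for use in a URL and decode it again."""
    return unquote(quote(text))
```

The garbled string looks wrong on screen, but the underlying bytes are
intact, which is why the original text can be recovered and why a
percent-encoded URL still resolves.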
