lucene-solr-dev mailing list archives

From Chris Hostetter
Subject Re: resin and UTF-8 in URLs
Date Sat, 03 Feb 2007 05:49:26 GMT

: For XML, I think trusting the XML parser, and not the servlet
: container is a better way to go.
: That means handing the XML parser an InputStream instead of a Reader.

you mean if there is no charset in the content-type? ... yeah, that was
what i (think i) was suggesting as far as XML goes, trust the user.
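
A minimal sketch of the "hand the parser an InputStream" idea, using only the JDK's built-in DOM parser. The class name XmlBytesDemo and the helper parseText are made up for illustration and are not Solr code. The point is that the parser sees the raw bytes and takes the encoding from the XML declaration (or a BOM), instead of receiving characters that something upstream has already decoded:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class XmlBytesDemo {

    /**
     * Hypothetical helper: parse raw bytes and return the root element's text.
     * Because the parser receives an InputStream, it decides the character
     * encoding itself from the XML declaration or byte order mark.
     */
    static String parseText(InputStream body) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(body);
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // The document declares ISO-8859-1 and really is encoded that way.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<field>caf\u00e9</field>";
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(parseText(new ByteArrayInputStream(bytes)));
    }
}
```

Had the same bytes been wrapped in a Reader that assumed UTF-8, the 0xE9 byte would have been mis-decoded before the parser ever saw the declaration.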

: There *is* one place I think we should use UTF-8 when there isn't a
: charset specified:
: a POST with "Content-Type: application/x-www-form-urlencoded".
: a) You can't get browsers to put a charset there.
: b) Browsers by default encode the form data in the charset of the form.
: c) We know more than the servlet container in this instance... we know
:    at least that our admin pages use UTF-8, and that a POST coming from
:    them will be UTF-8.

Hmmm ... okay i guess i can get behind that.  Can we at least agree that
if the client *does* specify a charset in the content-type header we'll
use it? ... browsers may not be doing it, but client libraries can.
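
A rough sketch of that rule for form-urlencoded POST bodies, in plain JDK code: use the charset from the Content-Type header when the client sent one, otherwise fall back to UTF-8. The class FormCharset and its methods pick and decodeValue are hypothetical names for illustration only; they are not part of Solr or the servlet API, and falling back to UTF-8 on an unknown charset name is an assumption of this sketch rather than something decided in the thread:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class FormCharset {

    /**
     * Hypothetical helper: return the charset named in a Content-Type header
     * value, or UTF-8 when none is given (or the name is not usable).
     */
    static Charset pick(String contentType) {
        if (contentType != null) {
            for (String param : contentType.split(";")) {
                String p = param.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String name = p.substring("charset=".length())
                            .trim().replace("\"", "");
                    if (!name.isEmpty()) {
                        try {
                            return Charset.forName(name);
                        } catch (IllegalArgumentException e) {
                            // Illegal or unsupported name: assumed fallback below.
                        }
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    /** Decode one percent-encoded form value using the chosen charset. */
    static String decodeValue(String raw, String contentType)
            throws UnsupportedEncodingException {
        return URLDecoder.decode(raw, pick(contentType).name());
    }

    public static void main(String[] args) throws Exception {
        String form = "application/x-www-form-urlencoded";
        // No charset parameter: treated as UTF-8.
        System.out.println(decodeValue("caf%C3%A9", form));
        // Client library that does declare a charset: honour it.
        System.out.println(decodeValue("caf%E9", form + "; charset=ISO-8859-1"));
    }
}
```

Both calls in main decode to the same string, which is the whole point: the declared charset wins when present, and UTF-8 only fills the gap when the header says nothing.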

