lucene-solr-dev mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: charset in POST from browser
Date Thu, 01 Feb 2007 06:24:38 GMT

: >    Content-type: application/x-www-form-urlencoded; charset=utf-8
: >
: > ...picking the charset based on the charset of the page containing the
: > form  (i assume you tested and verified this isn't happening?)
:
: Yep, FireFox2.
: I'd serve the page, do a search, kill the solr server, run nc -l -p
: 8983, and run the search again.  The body was encoded correctly, but
: just no charset info.

yeah ... the google cache of
"ppewww.physics.gla.ac.uk/~flavell/charset/form-i18n.html" (URL
currently 403) suggests that browsers don't send the charset parameter
because a lot of old CGI parsing libraries can't handle it.  RFC 2070
section 5.2 suggests that sending it is one method that can be used -- but
says "The best solution is to use the "multipart/form-data" media type" ...
perhaps if we change the forms to use that explicitly things would work.
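
To make the missing-charset case concrete, here is a rough sketch in Python
(purely illustrative, not Solr or Resin code; the function name
charset_from_content_type and the utf-8 default are assumptions) of how a
form parser could pick an encoding: use the charset parameter when the
browser sends one, otherwise fall back to a configured default.

```python
from email.message import Message


def charset_from_content_type(header_value, default="utf-8"):
    """Return the charset parameter of a Content-Type value, or `default`.

    `default` stands in for whatever the server is configured to assume
    when the browser omits the parameter (hypothetical setting).
    """
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns the lowercased charset, or None if absent.
    return msg.get_content_charset() or default


# Browser that sends the parameter:
with_param = charset_from_content_type(
    "application/x-www-form-urlencoded; charset=ISO-8859-1")

# What was observed on the wire: no charset parameter at all.
without_param = charset_from_content_type("application/x-www-form-urlencoded")
```

In the second case the parser has nothing to go on, so the result is only
as good as the configured default.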

actually ... all of the existing forms we have are GET -- so it's kind of
a moot issue, isn't it?  (i see there's a separate thread about
Resin and UTF-8 in URLs -- multipart/form-data wouldn't really help in that
case.)


Did you see my other comments, quoting what seemed to be a Resin FAQ that
mentioned "The character-encoding tag in the resin.conf"? ... it
sounds like that's what we should recommend to people using Resin ... i
suspect they wouldn't even *have* to use UTF-8 ... they just have to set it
to whatever encoding they want to use when POSTing queries.

if setting character-encoding in the <web-app> tag works for URL encoded
values, putting this in the resin.conf will probably work for that too.
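
A small Python illustration of why the server-side setting only has to match
the client (this is not how Resin does it; the field name q and the sample
value are made up): the same percent-encoded bytes decode to different text
depending on which encoding the server assumes.

```python
from urllib.parse import parse_qs, quote

# What a browser would send for the query "café" from a UTF-8 page.
body = "q=" + quote("café", encoding="utf-8")

# Server configured for the encoding the client actually used.
as_utf8 = parse_qs(body, encoding="utf-8")

# Server assuming a different encoding: same bytes, wrong characters.
as_latin1 = parse_qs(body, encoding="iso-8859-1")

print(body)
print(as_utf8["q"][0])
print(as_latin1["q"][0])
```

Swap both sides to ISO-8859-1 and the round trip is just as clean, which is
the point: any encoding works as long as both ends agree on it.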



-Hoss

