lucene-solr-dev mailing list archives

From Chris Hostetter <>
Subject Re: charset in POST from browser
Date Thu, 01 Feb 2007 08:18:49 GMT

: Other things might use POST for querying though.  Perhaps they can all
: set a charset while doing so.

well, i can think of a couple of scenarios...

1) POST multipart/* to either /select or the new style URLs ...  the
browsers should put a content-type with a charset on each part; the
ContentStream parsing code Ryan wrote should do the right thing, we only
have to rely on the Servlet Container to do the right thing for the parts
containing servlet request params -- hopefully they use the charset.

2) POST application/x-www-form-urlencoded to new style urls ... see below.

3) POST anything else to the new style urls ... parsed as a raw
ContentStream, charset taken from the content-type -- should work fine.

4) POST application/x-www-form-urlencoded to the current /select ... see
below.

5) POST */* to the /update ... it currently ignores content type and
assumes UTF-8 regardless of servlet container config ... we could
theoretically make it look at the content-type only for the charset and
still ignore the meat of the content-type.

6) GET anything ... see below.
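
For case #5, looking at the content-type only for the charset could be sketched roughly as below. This is a minimal, self-contained illustration and not Solr's actual update code: the class name ContentTypeCharset and the helper charsetOf are hypothetical, and the fallback argument stands in for the UTF-8 default that /update assumes today.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ContentTypeCharset {

    // Hypothetical helper: return the charset named in a Content-Type header
    // value, ignoring the media type itself. Falls back when the header is
    // missing, has no charset parameter, or names an unknown charset.
    public static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type ("the meat"), which is skipped on purpose
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String name = param.substring(0, eq).trim();
            if (!name.equalsIgnoreCase("charset")) {
                continue;
            }
            String value = param.substring(eq + 1).trim();
            // the parameter value may be quoted, e.g. charset="utf-8"
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            try {
                return Charset.forName(value);
            } catch (IllegalArgumentException e) {
                // illegal or unsupported charset name
                return fallback;
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/xml; charset=ISO-8859-1", StandardCharsets.UTF_8));
        System.out.println(charsetOf("text/xml", StandardCharsets.UTF_8));
    }
}
```

Falling back to UTF-8 keeps the current behaviour for clients that send no charset at all, so only clients that explicitly declare something else would see a change.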

"see below" marks the situations where i don't think we can glean anything
from the request itself -- we have to make an assumption based on config.
for #2 and #4 we could conceivably have a solrconfig.xml option indicating
what charset Solr should assume, and then we can (apparently) use
HttpServletRequest.setCharacterEncoding to specify that's the charset we
want the servlet container to use when parsing the input -- but i don't
think this helps case #6 -- i can't find any portable way to tell the
servlet container how to parse the URL, so if we have to rely on
documentation to instruct people on how to deal with that, we might as
well do the same thing for #2 and #4 (let it be in the servlet container
config instead of the solrconfig)
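
To illustrate why an assumption is needed for #2, #4 and #6: percent-encoded bytes in a form body or URL carry no charset of their own, so the same input decodes differently depending on what the decoder assumes. The sketch below uses only java.net.URLDecoder; the class name FormDecodingDemo and its decode helper are made up for the example. In a servlet, the equivalent knob for POST bodies would be calling setCharacterEncoding on the request before any parameter is read.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class FormDecodingDemo {

    // Hypothetical helper: decode a percent-encoded value under an assumed
    // charset, roughly what a container does for form params and query strings.
    public static String decode(String encoded, String assumedCharset) {
        try {
            return URLDecoder.decode(encoded, assumedCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + assumedCharset, e);
        }
    }

    public static void main(String[] args) {
        // "caf" followed by e-acute, percent-encoded from its UTF-8 bytes
        String encoded = "caf%C3%A9";

        // assuming UTF-8 recovers the original four characters
        System.out.println(decode(encoded, "UTF-8").length());

        // assuming ISO-8859-1 turns the two bytes into two separate characters
        System.out.println(decode(encoded, "ISO-8859-1").length());
    }
}
```

Nothing in the encoded input says which reading is right, which is why the choice has to come from config or documentation rather than from the request.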

(we should of course test all of these scenarios ... i'm just guessing #1,
#3 and #5 all work okay)

