lucene-solr-dev mailing list archives

From "Andrew Schurman (JIRA)" <j...@apache.org>
Subject [jira] Commented: (SOLR-443) POST queries don't declare its charset
Date Mon, 24 Dec 2007 02:18:43 GMT

    [ https://issues.apache.org/jira/browse/SOLR-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554211 ]

Andrew Schurman commented on SOLR-443:
--------------------------------------

Hmm... I just tested the latest patch on a different machine with Tomcat 6.0.14 and it does
appear to work (I must have some sort of caching problem on my other machine).

As for standards, I don't believe it is current, but I found the HTML Internationalization RFC,
http://www.ietf.org/rfc/rfc2070.txt. On page 16, it says that when a charset is set alongside
a content-type of {{application/x-www-form-urlencoded}}, the understanding is that the "URL encoding
of [RFC1738] is applied on top of the specified character encoding, as a kind of implicit
Content-Transfer-Encoding". Given that, it does seem valid to set the charset on
the POST.
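
To illustrate what "applied on top of the specified character encoding" means in practice, here is a minimal Java sketch. It is not the code from the attached patch; the class name, the helper methods, and the sample value are all hypothetical. It percent-encodes a value over its UTF-8 bytes, then decodes the same body once as UTF-8 and once as ISO-8859-1 to show why the receiver needs to be told the charset.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical illustration only; not taken from the SOLR-443 patch.
class FormCharsetSketch {
    // The header value a client could send so the server knows the body charset.
    static final String CONTENT_TYPE =
            "application/x-www-form-urlencoded; charset=UTF-8";

    // Percent-encode a form value over its UTF-8 byte representation.
    static String encodeUtf8(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Decode a percent-encoded value, interpreting the bytes in the given charset.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String query = "caf\u00e9";             // "café", one non-ASCII character
        String body = encodeUtf8(query);        // "caf%C3%A9": two bytes for U+00E9

        // A receiver that knows the charset recovers the original value.
        String right = decode(body, "UTF-8");

        // A receiver that assumes ISO-8859-1 reads each byte as its own
        // character and ends up with "caf\u00c3\u00a9" instead.
        String wrong = decode(body, "ISO-8859-1");

        System.out.println("Content-Type: " + CONTENT_TYPE);
        System.out.println("body=" + body);
        System.out.println("utf8 round trip ok: " + right.equals(query));
        System.out.println("latin1 round trip ok: " + wrong.equals(query));
    }
}
```

The body bytes are identical in both cases; only the receiver's assumption differs, which is why declaring the charset in the Content-Type header matters.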

> POST queries don't declare its charset
> --------------------------------------
>
>                 Key: SOLR-443
>                 URL: https://issues.apache.org/jira/browse/SOLR-443
>             Project: Solr
>          Issue Type: Bug
>          Components: clients - java
>    Affects Versions: 1.2
>         Environment: Tomcat 6.0.14
>            Reporter: Andrew Schurman
>            Priority: Minor
>         Attachments: solr-443.patch, solr-443.patch
>
>
> When sending a query via POST, the content-type is not set. The content charset for the
> POST parameters is set, but this only appears to be used for creating the Content-Length
> header in the commons library. Since a query is encoded in UTF-8, the HTTP headers should
> also specify the content-type charset.
> On Tomcat, this causes problems when the query string contains non-ASCII characters (characters
> with accents and such), as it tries to parse the POST body in its default ISO-8859-1. There
> appears to be no way to set/change the default encoding for a message body on Tomcat.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

