lucene-solr-user mailing list archives

From Otis Gospodnetic <otis.gospodne...@gmail.com>
Subject Re: POST query with non-ASCII to solr using httpclient wont work
Date Sun, 13 Jan 2013 05:09:39 GMT
Jie Sun,

Just use solrj :)

Otis
Solr & ElasticSearch Support
http://sematext.com/
On Jan 12, 2013 7:40 PM, "Jie Sun" <jsun5555@yahoo.com> wrote:

> When I use HttpClient and its PostMethod to post a query containing Chinese
> characters, Solr either fails to return any records or returns everything.
>             ... ...
>             method = new PostMethod(solrReq);
>             method.getParams().setContentCharset("UTF-8");
>             method.setRequestHeader("Content-Type",
>                 "application/x-www-form-urlencoded; charset=UTF-8");
>             ... ...
>
> I used tcpdump and found that the query my application sends to Solr is a
> URL-encoded query string (see the "q=xxx" part):
>
> ../....SPOST /solr/413/select HTTP/1.1
> Content-Type: application/x-www-form-urlencoded; charset=UTF-8
> Accept: */*
> User-Agent: Jakarta Commons-HttpClient/3.1
> Host: 172.20.73.142:8080
> Content-Length: 192
>
>
> q=type%3Amessage+AND+customer_id%3A413+AND+subject_zhs%3A%E8%83%BD%E5%8A%9B+&hl.fl=&qt=standard&wt=standard&rows=20
> 17:09:55.592527 IP xxx> yyy.webcache: tcp 0
> ... ...
>
> I found that this URL encoding is what causes the Solr query to fail. I
> confirmed this by copying the URL-encoded query above to a file and sending
> it with curl, which gave the same error. If I replace the query with the
> decoded string, it works with Solr:
>
> curl -v -H 'Content-type:application/x-www-form-urlencoded; charset=utf-8' \
>     http://localhost:8080/solr/413/select --data @/tmp/chinese_query
>
> When /tmp/chinese_query contains the following, it works with Solr:
>
> q=type:message+AND+customer_id:413+AND+subject_zhs:能力+&hl.fl=&qt=standard&wt=standard&rows=20
>
> But if I switch /tmp/chinese_query to the URL-encoded string, it fails
> again with the same error:
>
> q=type%3Amessage+AND+customer_id%3A413+AND+subject_zhs%3A%E8%83%BD%E5%8A%9B+&hl.fl=&qt=standard&wt=standard&rows=20
>
> So, my conclusions:
> 1) Solr (I am using 3.5) only accepts the decoded query string; it fails
> with a URL-encoded query.
> 2) HttpClient sends out a URL-encoded string no matter what (there seems to
> be no way to make it send a POST request without URL-encoding the body).
>
> Am I missing something, or do you have any suggestions about what I am
> doing wrong?
> Thanks,
> Jie
>
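
For reference, the percent-encoded body in the tcpdump capture is what standard UTF-8 form encoding produces from the decoded query string. The following is a minimal sketch using only `java.net.URLEncoder` and `java.net.URLDecoder` from the JDK. The class name `FormEncodingSketch` and its two helper methods are hypothetical and exist only for illustration. It does not contact Solr, and it does not explain why the Solr 3.5 setup in the thread rejects the encoded form; it only shows that the two strings carry the same query.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper class for illustration; not part of HttpClient or Solr.
public class FormEncodingSketch {

    // Encode one form field as name=value using UTF-8 percent-encoding.
    static String encodeParam(String name, String value) throws UnsupportedEncodingException {
        return URLEncoder.encode(name, "UTF-8") + "=" + URLEncoder.encode(value, "UTF-8");
    }

    // Decode a single percent-encoded value back to text, again as UTF-8.
    static String decodeValue(String encoded) throws UnsupportedEncodingException {
        return URLDecoder.decode(encoded, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // \u80FD\u529B are the two Chinese characters from the query in the thread.
        String q = "type:message AND customer_id:413 AND subject_zhs:\u80FD\u529B ";
        String body = encodeParam("q", q);

        // Prints the same q=... value that appears in the captured request body.
        System.out.println(body);

        // Strip the leading "q=" and decode; this round-trips to the original query.
        System.out.println(q.equals(decodeValue(body.substring(2))));
    }
}
```

Running `main` prints `q=type%3Amessage+AND+customer_id%3A413+AND+subject_zhs%3A%E8%83%BD%E5%8A%9B+` followed by `true`, which matches the `q` parameter in the capture.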
