lucene-dev mailing list archives

From "Yonik Seeley (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-2456) Filter queries of values with + sign not decoded correctly
Date Mon, 04 Apr 2011 22:27:05 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015663#comment-13015663
] 

Yonik Seeley commented on SOLR-2456:
------------------------------------

There are a lot of different things going on here.  First, let's focus on lucene query syntax
and forget about HTTP URL encoding (that's just transfer syntax stuff).

A lucene query of
required_experience:1 to 2 Years
is really equivalent to
required_experience:1 default_field:to default_field:2 default_field:Years
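
To make the equivalence above concrete, here is a deliberately simplified Python sketch. It is not Lucene's query parser: it only splits on whitespace and attaches a default field to bare terms, ignoring quotes, escapes, and operators. The function name split_clauses and the literal field name "default_field" are placeholders for illustration.

```python
def split_clauses(query, default_field="default_field"):
    """Toy illustration of how unquoted whitespace separates clauses.

    Assumption: this mimics only the whitespace-splitting behaviour
    described above; it is not how Lucene is actually implemented.
    """
    clauses = []
    for token in query.split():
        if ":" in token:
            # The token already names its field, so keep it as written.
            clauses.append(token)
        else:
            # A bare term falls back to the default search field.
            clauses.append(f"{default_field}:{token}")
    return " ".join(clauses)


parsed = split_clauses("required_experience:1 to 2 Years")
print(parsed)
```

Only the first token stays attached to required_experience, which is why the unquoted form cannot match the stored string value "1 to 2 Years".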

Next, URL encoding is normally an implementation detail.  If Solr started supporting some
other transport such as Thrift, there would be no %2b at all.  When the servlet container
sees a %2b, it translates it into a "+" before Solr gets it.
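
As a rough illustration of that decoding step, the sketch below uses Python's standard urllib.parse.parse_qs as a stand-in for what a servlet container does with a query string. That substitution is an assumption made for demonstration purposes; the point is only that %2B arrives as a literal "+", while a raw "+" in a query string is decoded as a space.

```python
from urllib.parse import parse_qs

# Two query strings as they might appear on the wire.
encoded_plus = "fq=required_experience:10%2B%20Years"  # "+" sent as %2B
literal_plus = "fq=required_experience:10+%20Years"    # "+" sent unescaped

# parse_qs applies form-style decoding: %XX escapes are resolved
# and a raw "+" is treated as a space.
decoded_encoded = parse_qs(encoded_plus)["fq"][0]
decoded_literal = parse_qs(literal_plus)["fq"][0]

print(repr(decoded_encoded))  # the "+" survives
print(repr(decoded_literal))  # the "+" has become a space
```

Either way, the application behind the decoder never sees %2b itself, only the decoded characters.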

There are certain query parsers (qparsers) specifically designed to help out at the lucene
syntax level (so you don't have to deal with escaping special query parser chars, double quotes,
etc.).
http://wiki.apache.org/solr/SolrQuerySyntax

Since you're on 4.0-dev, I'd recommend using the "term" qparser for this:

fq={!term f=required_experience}10+ Years

The benefit is that at the lucene syntax level, there is no escaping whatsoever needed when
appending the value you are filtering on.

Now, for the HTTP layer, clients normally take care of the required escaping.  But if you're
using something low-level like curl that does not do it for you, then it would look like:

fq={!term%20f=required_experience}10%2b%20Years
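
For clients that do the escaping for you, the following Python sketch shows the idea using the standard urllib.parse.urlencode. The variable names are illustrative only, and the exact escaped form differs from the hand-written one above (for example, spaces become "+" and the braces are percent-encoded), but it decodes to the same fq value.

```python
from urllib.parse import urlencode, parse_qs

# The filter exactly as it should reach Solr, with no escaping
# applied at the lucene syntax level.
raw_fq = "{!term f=required_experience}10+ Years"

# urlencode handles the HTTP-level escaping of the parameter value.
query_string = urlencode({"fq": raw_fq})
print(query_string)

# Decoding the query string again recovers the original value.
round_trip = parse_qs(query_string)["fq"][0]
print(round_trip)
```

The literal "+" in the value is sent as %2B, so it is not mistaken for an encoded space on the receiving side.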


> Filter queries of values with + sign not decoded correctly
> ----------------------------------------------------------
>
>                 Key: SOLR-2456
>                 URL: https://issues.apache.org/jira/browse/SOLR-2456
>             Project: Solr
>          Issue Type: Bug
>          Components: search
>    Affects Versions: 4.0
>            Reporter: Scott Kister
>            Priority: Minor
>
> Querying by filters with values containing a + sign does not work as expected. Querying
> by the quoted value fails. Escaping the + and space without quotes also fails. I did finally
> get a query to work, but it involved both quoting the value and escaping the +, but not the
> space.
> Either quoting the value, or escaping should work.
> To reproduce, create a test collection with two documents.
>   "response":{"numFound":2,"start":0,"docs":[{
>         "listing_id":2483808693,
>         "required_experience":["10+ Years"]},{
>         "listing_id":2484835296,
>         "required_experience":["1 to 2 Years"]}]
> These all return 0 results, I believe the first 4 should work.
> ?fq=required_experience:1+to+2+Years
> ?fq=required_experience:1%20to%202%20Years
> ?fq=required_experience:10%2B%20Years
> ?fq=required_experience:"10+ Years"
> ?fq=required_experience:10\+\ Years
> These do work, the second one should not work since %2B is quoted and should not then
> be urldecoded.
> ?fq=required_experience:"1 to 2 Years"
> ?fq=required_experience:"10%2B Years"
> I tested with the most recent build, apache-solr-4.0-2011-04-01_08-37-23.tgz
> schema.xml for required_experience is
>     <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
>    <field name="required_experience" type="string" indexed="true" />

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


