lucene-dev mailing list archives

From "Hoss Man (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-3823) Parentheses in a boost query cause errors
Date Wed, 12 Sep 2012 16:38:07 GMT

    [ https://issues.apache.org/jira/browse/SOLR-3823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454104#comment-13454104 ]

Hoss Man commented on SOLR-3823:
--------------------------------

bq. I couldn't tell that this issue was an actual problem or a case of users putting whitespace
where it doesn't belong and prior versions being more forgiving.

James: the core of the bug was your use of SolrPluginUtils.parseFieldBoosts to try and parse
the bq params.

This is not safe -- if you look at the method, it is an extremely trivial utility that is specific
to parsing qf/pf style strings containing a list of field names and boosts.  It's _not_ a
safe way to parse an arbitrary query string, and any non-trivial query string can cause problems
with it.
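
To illustrate why, here is a minimal sketch of a whitespace-splitting field/boost parser. It is a hypothetical, simplified stand-in written for this example (the class and method names are invented), not the actual SolrPluginUtils code, but it shows how the bq value from this issue would be chopped into pieces such as "-xxx)", which matches the fragment named in the reported ParseException.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FieldBoostSketch {

    // Hypothetical, simplified qf/pf-style parser (an assumption for
    // illustration only): split on whitespace, then split each token
    // on '^' into a name and an optional boost.
    public static Map<String, Float> parseFieldBoostsSketch(String in) {
        Map<String, Float> out = new LinkedHashMap<>();
        if (in == null || in.trim().isEmpty()) {
            return out;
        }
        for (String token : in.trim().split("\\s+")) {
            String[] parts = token.split("\\^");
            // No '^' in the token means no explicit boost was given.
            out.put(parts[0], parts.length == 1 ? null : Float.valueOf(parts[1]));
        }
        return out;
    }

    public static void main(String[] args) {
        // Intended input: a list of field names with optional boosts.
        System.out.println(parseFieldBoostsSketch("title^2.0 body"));
        // A real query string is split at the space inside the parentheses,
        // leaving two fragments that are not valid queries on their own.
        System.out.println(parseFieldBoostsSketch("(*:* -xxx)^999"));
    }
}
```

With the second input, the sketch yields the keys "(*:*" (no boost) and "-xxx)" (boost 999.0); neither fragment is parseable as a standalone query.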

As you noted in SOLR-3278, parseFieldBoosts is used for parsing the "bf" param, and that usage
is unsafe as well, a long-standing bug (SOLR-2014), but since functions tend to be much
simpler, it has historically been less problematic.  When people run into problems with it,
the workaround is to use "bq={!func}..." instead.

bq. I would rather get a fix that preserves the negative boost support

Since SOLR-3278 had not been released publicly outside of the ALPHA/BETA, my first priority
was to fix the regression compared to 3.x where non-trivial bq queries worked fine.

The documented method of dealing with "negative boosting" in Solr is actually the type of
query that was the crux of this bug report, and I updated the tests you added in SOLR-3278
to use that pattern...

https://wiki.apache.org/solr/SolrRelevancyFAQ#How_do_I_give_a_negative_.28or_very_low.29_boost_to_documents_that_match_a_query.3F

I have no objections to supporting "true" negative boosts, but I think the right way to do
it is in the query parsers / QParsers themselves (so that the boosts can be on any clause)
and not just as a special hack for bq/bf (the fact that it works in bf is actually just a fluke
of its buggy implementation).  As you can see in LUCENE-4378, though, this is a contentious idea.


                
> Parentheses in a boost query cause errors
> -----------------------------------------
>
>                 Key: SOLR-3823
>                 URL: https://issues.apache.org/jira/browse/SOLR-3823
>             Project: Solr
>          Issue Type: Bug
>          Components: query parsers
>    Affects Versions: 4.0-BETA
>         Environment: Mac, jdk 1.6, Chrome
>            Reporter: Mathos Marcer
>            Assignee: Hoss Man
>             Fix For: 4.0, 5.0
>
>
> When using a boost query (bq) that contains parentheses (like this example from the
> Relevancy Cookbook section of the wiki):
> {noformat}
>  ? defType = dismax 
>     & q = foo bar 
>     & bq = (*:* -xxx)^999 
> {noformat}
> You get the following error:
> org.apache.lucene.queryparser.classic.ParseException: Cannot parse '-xxx)': Encountered
> " ")" ") "" at line 1, column 12. Was expecting one of: <EOF> <AND> ... <OR>
> ... <NOT> ... "+" ... "-" ... <BAREOPER> ... "(" ... "*" ... "^" ... <QUOTED>
> ... <TERM> ... <FUZZY_SLOP> ... <PREFIXTERM> ... <WILDTERM> ... <REGEXPTERM>
> ... "[" ... "{" ... <NUMBER> ...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

