lucene-dev mailing list archives

From "Brian Carver (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-2649) MM ignored in edismax queries with operators
Date Sat, 04 Feb 2012 16:52:53 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200476#comment-13200476 ]

Brian Carver commented on SOLR-2649:
------------------------------------

I'm new to Solr, so I have only a tenuous grasp of some of these issues, but I've understood boolean
logic for a couple of decades, and it seems to me that Solr's current behavior thwarts
the expectations of users who know what they want and explicitly ask for it. Mike's
example above is what troubles me.

Principles:
1. The maintainer configures whitespace to be interpreted as AND or OR, and Solr should do nothing
to change that for particular queries.
2. Where a user inputs an ambiguous query, a default rule for operator scope
is needed, and that rule also should not change for particular queries.

So, Mike says he sets whitespace to AND, users know this, and then a user enters:

Example 1: (A or B or C) "D E"

Given the above assumptions, the only reasonable interpretation of this is:

(A or B or C) AND "D E", which is a conjunction with two conjuncts, both of which must be satisfied
for a result to be returned. Yet Mike (or his user) gets results that satisfy only one of the conjuncts.
That shouldn't happen.
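
To make the expected semantics concrete, here is a minimal Java sketch of that conjunction. It is only an illustration of the boolean logic, not Solr or Lucene code; the class ConjunctionSketch, the method matches, and the containsPhraseDE flag are hypothetical names introduced for this example.

```java
import java.util.Set;

public class ConjunctionSketch {
    // Hypothetical helper (not a Solr API): returns true only when a document
    // satisfies BOTH conjuncts of (A OR B OR C) AND "D E".
    static boolean matches(Set<String> terms, boolean containsPhraseDE) {
        // First conjunct: at least one of A, B, C is present.
        boolean firstConjunct =
            terms.contains("A") || terms.contains("B") || terms.contains("C");
        // Second conjunct: the phrase "D E" is present (assumed precomputed).
        return firstConjunct && containsPhraseDE;
    }

    public static void main(String[] args) {
        System.out.println(matches(Set.of("A"), true));       // both conjuncts hold
        System.out.println(matches(Set.of("A", "B"), false)); // phrase missing
        System.out.println(matches(Set.of("X"), true));       // none of A, B, C
    }
}
```

Under this reading, a document that satisfies only one of the two conjuncts should never be returned.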

I agree, though, that how to understand and apply mm in some of the examples above raises hard
questions. That is why many search engines provide two interfaces: a "natural language"
interface and one that requires strict boolean syntax. Allowing people to enter some
boolean operators (which they will expect to be respected no matter what) while simultaneously
interpreting their query with mm handling intended for a more rough-and-ready approach is
going to confuse end users most of the time. So, in some ways, ignoring mm when
operators are used is a feature, not a bug. But that is orthogonal to the completely unacceptable
outcome Mike described: whatever is causing that is a bug.
                
> MM ignored in edismax queries with operators
> --------------------------------------------
>
>                 Key: SOLR-2649
>                 URL: https://issues.apache.org/jira/browse/SOLR-2649
>             Project: Solr
>          Issue Type: Bug
>          Components: search
>    Affects Versions: 3.3
>            Reporter: Magnus Bergmark
>            Priority: Minor
>
> Hypothetical scenario:
>   1. User searches for "stocks oil gold" with MM set to "50%"
>   2. User adds "-stockings" to the query: "stocks oil gold -stockings"
>   3. User gets no hits since MM was ignored and all terms were AND-ed together
> The behavior seems to be intentional, although the reason why is never explained:
>   // For correct lucene queries, turn off mm processing if there
>   // were explicit operators (except for AND).
>   boolean doMinMatched = (numOR + numNOT + numPluses + numMinuses) == 0; 
> (lines 232-234 taken from tags/lucene_solr_3_3/solr/src/java/org/apache/solr/search/ExtendedDismaxQParserPlugin.java)
> This makes edismax unsuitable as a replacement for dismax; mm is one of the primary features
> of dismax.
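
The scenario in the issue description can be traced with a small Java sketch. The doMinMatched expression is copied from the quoted snippet; everything else (the class MinMatchSketch, the method requiredClauses, and the assumption that a percentage mm is rounded down) is hypothetical scaffolding for illustration and is not taken from the Solr source.

```java
public class MinMatchSketch {
    // Same expression as the quoted snippet: mm processing stays on only
    // when the query contains no OR, NOT, '+' or '-' operators.
    static boolean doMinMatched(int numOR, int numNOT, int numPluses, int numMinuses) {
        return (numOR + numNOT + numPluses + numMinuses) == 0;
    }

    // Hypothetical helper: number of optional clauses that must match for a
    // percentage mm, assuming the result is rounded down (integer division).
    static int requiredClauses(int optionalClauses, int mmPercent) {
        return optionalClauses * mmPercent / 100;
    }

    public static void main(String[] args) {
        // Step 1: "stocks oil gold" has no operators, so mm=50% applies.
        System.out.println(doMinMatched(0, 0, 0, 0));
        System.out.println(requiredClauses(3, 50));

        // Step 2: "stocks oil gold -stockings" has one '-', so mm is skipped.
        System.out.println(doMinMatched(0, 0, 0, 1));
    }
}
```

With mm applied, only one of the three terms has to match under these assumptions; once the single minus operator switches mm off, the scenario's step 3 follows, where all terms end up required.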

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

