lucene-dev mailing list archives

From "Hoss Man (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-2649) MM ignored in edismax queries with operators
Date Fri, 03 Feb 2012 22:20:51 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200102#comment-13200102 ]

Hoss Man commented on SOLR-2649:
--------------------------------

bq. Counting multiple terms as 1 because they are in parenthesis together doesn't seem like
a good idea to me.

I disagree, but it definitely just seems like a matter of opinion -- i don't know that we
could ever come up with something that makes sense in all use cases

personally i think the sanest change would be to say that "mm" applies to all top level SHOULD
clauses in the query (regardless of whether they have an explicit OR or not) -- exactly as
it always has in dismax.  if a top level clause is a nested boolean query, then "mm" shouldn't
apply to the clauses nested inside it, because it doesn't make sense to blur the "count" of how
many SHOULD clauses there are at the various levels.

what would mm=5 mean for a query like "q=X AND Y (a b) (c d) (e f) (g h)" if you looked at
all the nested subqueries?  that only 5 of those 8 (lowercase) leaf level clauses are required?
 how would that be implemented on the underlying BooleanQuery objects w/o completely flattening
the query (which would break the intent of the user when they grouped them) ... it seems like
mm=5 (or mm=100%) should mean 5 (or 100%) of the top level SHOULD clauses are required ...
the default query op should determine how any top level clauses that are BooleanQueries are
dealt with.

...but that's just my opinion.  
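
To make the two ways of counting concrete, here is a minimal sketch of the example query
"X AND Y (a b) (c d) (e f) (g h)". It is a toy model, not Lucene or Solr code: Clause, Occur,
and every method name are hypothetical stand-ins, and capping mm at the number of optional
clauses is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TopLevelMmSketch {
    // Hypothetical stand-ins for this sketch only; not Lucene classes.
    public enum Occur { MUST, SHOULD, MUST_NOT }

    public static final class Clause {
        final Occur occur;
        final List<String> leafTerms; // terms inside the clause; a group has several

        public Clause(Occur occur, String... leafTerms) {
            this.occur = occur;
            this.leafTerms = Arrays.asList(leafTerms);
        }
    }

    // Count top-level SHOULD clauses; a parenthesized group counts as one.
    public static int topLevelShouldCount(List<Clause> clauses) {
        int n = 0;
        for (Clause c : clauses) {
            if (c.occur == Occur.SHOULD) n++;
        }
        return n;
    }

    // The alternative: count every leaf term under a top-level SHOULD clause.
    public static int leafShouldCount(List<Clause> clauses) {
        int n = 0;
        for (Clause c : clauses) {
            if (c.occur == Occur.SHOULD) n += c.leafTerms.size();
        }
        return n;
    }

    // Assumption: an integer mm is capped at the number of optional clauses.
    public static int effectiveMm(int mm, int optionalCount) {
        return Math.max(0, Math.min(mm, optionalCount));
    }

    // X AND Y (a b) (c d) (e f) (g h): two MUST terms, four SHOULD groups.
    public static List<Clause> exampleQuery() {
        List<Clause> q = new ArrayList<>();
        q.add(new Clause(Occur.MUST, "X"));
        q.add(new Clause(Occur.MUST, "Y"));
        q.add(new Clause(Occur.SHOULD, "a", "b"));
        q.add(new Clause(Occur.SHOULD, "c", "d"));
        q.add(new Clause(Occur.SHOULD, "e", "f"));
        q.add(new Clause(Occur.SHOULD, "g", "h"));
        return q;
    }

    public static void main(String[] args) {
        List<Clause> q = exampleQuery();
        int topLevel = topLevelShouldCount(q);
        System.out.println("top level SHOULD clauses: " + topLevel);
        System.out.println("leaf level SHOULD terms:  " + leafShouldCount(q));
        System.out.println("effective mm for mm=5:    " + effectiveMm(5, topLevel));
    }
}
```

In this model the query has 4 top level SHOULD clauses but 8 leaf terms, so under the top
level reading mm=5 can require at most all 4 groups, and what happens inside each group is
left to the default query op.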


                
> MM ignored in edismax queries with operators
> --------------------------------------------
>
>                 Key: SOLR-2649
>                 URL: https://issues.apache.org/jira/browse/SOLR-2649
>             Project: Solr
>          Issue Type: Bug
>          Components: search
>    Affects Versions: 3.3
>            Reporter: Magnus Bergmark
>            Priority: Minor
>
> Hypothetical scenario:
>   1. User searches for "stocks oil gold" with MM set to "50%"
>   2. User adds "-stockings" to the query: "stocks oil gold -stockings"
>   3. User gets no hits since MM was ignored and all terms were AND-ed together
> The behavior seems to be intentional, although the reason why is never explained:
>   // For correct lucene queries, turn off mm processing if there
>   // were explicit operators (except for AND).
>   boolean doMinMatched = (numOR + numNOT + numPluses + numMinuses) == 0; 
> (lines 232-234 taken from tags/lucene_solr_3_3/solr/src/java/org/apache/solr/search/ExtendedDismaxQParserPlugin.java)
> This makes edismax unsuitable as a replacement for dismax; mm is one of the primary features
> of dismax.
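
The effect of the quoted check on the scenario above can be sketched as follows. Only the
final expression is taken from the quoted source; the whitespace-based operator counting and
the class and method names are simplifying assumptions for illustration, not how the real
edismax parser tokenizes a query.

```java
public class DoMinMatchedSketch {
    // Hypothetical helper: counts operators on whitespace-separated tokens,
    // then applies the same expression as the quoted line of the parser.
    public static boolean doMinMatched(String userQuery) {
        int numOR = 0, numNOT = 0, numPluses = 0, numMinuses = 0;
        for (String tok : userQuery.trim().split("\\s+")) {
            if (tok.equals("OR")) {
                numOR++;
            } else if (tok.equals("NOT")) {
                numNOT++;
            } else if (tok.startsWith("+")) {
                numPluses++;
            } else if (tok.startsWith("-")) {
                numMinuses++;
            }
            // AND is deliberately not counted, matching the quoted comment.
        }
        return (numOR + numNOT + numPluses + numMinuses) == 0;
    }

    public static void main(String[] args) {
        System.out.println(doMinMatched("stocks oil gold"));
        System.out.println(doMinMatched("stocks oil gold -stockings"));
    }
}
```

Under these assumptions, "stocks oil gold" keeps mm processing on, while adding "-stockings"
turns it off for the entire query, which is the behavior the report describes.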

