lucene-solr-user mailing list archives

From Jonathan Rochkind <rochk...@jhu.edu>
Subject Re: lucene parser, negative OR operands
Date Wed, 18 May 2011 15:08:58 GMT
On 5/17/2011 8:00 PM, Yonik Seeley wrote:
> This doesn't have to do with Solr's support of pure-negative top-level
> queries, but does have to do with
> a long standing confusion of how the lucene queryparser works with
> some of the operators (i.e. not really boolean logic).
>
> In a Lucene BooleanQuery, clauses are mandatory, optional, or prohibited.
> -foo OR -bar actually parses to a boolean query with two prohibited
> clauses... essentially the
> same as -foo AND -bar.  You can see this by adding debugQuery=true to
> the request.

Thanks, Yonik. I recall hearing about this before but was vague on the 
details; thanks for supplying some and refreshing my memory.

So I guess there is no such thing as an "optional prohibited" clause.  
Which is what makes "-one OR -two" the same thing as "-one AND -two".  
Actually, yeah, an "optional prohibited" clause doesn't really even 
make sense. Hmm.
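
To make that concrete, here is a toy model in Python of the three clause types Yonik describes. It is not Lucene code: `parse_flat`, `search`, `DOCS`, and the `match_all_if_pure_negative` flag are all made up for illustration, and the mini-parser only handles bare terms, a `-` prefix, and AND/OR. The flag stands in for Solr's handling of pure-negative top-level queries.

```python
# Toy model of BooleanQuery clause semantics (illustrative only, not Lucene code).
MUST, SHOULD, MUST_NOT = "MUST", "SHOULD", "MUST_NOT"

# Hypothetical four-document index: doc id -> terms in the doc.
DOCS = {1: {"one"}, 2: {"two"}, 3: {"one", "two"}, 4: {"three"}}


def parse_flat(query):
    """Hypothetical mini-parser: bare terms, a '-' prefix, AND, OR. No parens."""
    tokens = query.split()
    clauses = []
    for i, tok in enumerate(tokens):
        if tok in ("AND", "OR"):
            continue
        if tok.startswith("-"):
            # A '-' prefix always yields a prohibited clause; the operator
            # next to it has nothing left to decide.
            clauses.append((MUST_NOT, tok[1:]))
        else:
            neighbours = tokens[i - 1:i] + tokens[i + 1:i + 2]
            clauses.append((MUST if "AND" in neighbours else SHOULD, tok))
    return clauses


def search(clauses, docs, match_all_if_pure_negative=False):
    """Return ids of docs matching a flat list of (occur, term) clauses."""
    must = [t for occur, t in clauses if occur == MUST]
    should = [t for occur, t in clauses if occur == SHOULD]
    must_not = [t for occur, t in clauses if occur == MUST_NOT]
    hits = set()
    for doc_id, terms in docs.items():
        if any(t in terms for t in must_not):
            continue  # a prohibited term is present
        if must:
            ok = all(t in terms for t in must)
        elif should:
            ok = any(t in terms for t in should)
        else:
            # Only prohibited clauses: nothing positive to match against.
            ok = match_all_if_pure_negative
        if ok:
            hits.add(doc_id)
    return hits


or_form = parse_flat("-one OR -two")
and_form = parse_flat("-one AND -two")
same_clauses = or_form == and_form  # both are two MUST_NOT clauses
top_level_hits = search(or_form, DOCS, match_all_if_pure_negative=True)
```

In this model both spellings produce the same two prohibited clauses, so `top_level_hits` contains only doc 4 (neither term present). Real boolean logic for "(not one) OR (not two)" would also return docs 1 and 2.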

If I want to understand more about how the lucene query parser does its 
thing, can anyone suggest the source files I should be looking at?

If I really do want actual boolean logic behavior, what are my options?  
I guess one is trying to write my own query parser.

Hmm, for that particular query, what about using parens to force a 
sub-query?

(-one) OR (-two)

Ha, nope, that runs into a different problem (or is it the same 
problem?) and always returns 0 hits.  It looks like the lucene query 
parser can't handle pure-negative sub-queries like that separated by OR?  
Not sure why; can anyone explain that one?

For that particular pattern, this crazy refactoring of the query does 
work and gets the actual boolean logic result of "(not 'one') OR (not 
'two')":

(*:* AND -one) OR (*:* AND -two)
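
A rough sketch of why the two parenthesized forms differ, again as a toy Python model and not Lucene code. The query trees are written by hand and are my assumption about what the parser produces for each form (debugQuery=true would show the real thing); `evaluate`, `term`, `boolean`, and `DOCS` are made-up names.

```python
# Toy evaluator for nested boolean queries (illustrative only, not Lucene code).
# Hypothetical four-document index: doc id -> terms in the doc.
DOCS = {1: {"one"}, 2: {"two"}, 3: {"one", "two"}, 4: {"three"}}

MATCH_ALL = ("all",)  # stands in for *:*


def term(t):
    return ("term", t)


def boolean(*clauses):
    """Each clause is an (occur, subquery) pair."""
    return ("bool", list(clauses))


def evaluate(query, docs):
    """Return the set of doc ids matched by a query tree."""
    kind = query[0]
    if kind == "all":
        return set(docs)
    if kind == "term":
        return {d for d, terms in docs.items() if query[1] in terms}
    # kind == "bool": sort sub-results by how their clause occurs.
    buckets = {"MUST": [], "SHOULD": [], "MUST_NOT": []}
    for occur, sub in query[1]:
        buckets[occur].append(evaluate(sub, docs))
    if buckets["MUST"]:
        hits = set.intersection(*buckets["MUST"])
    elif buckets["SHOULD"]:
        hits = set.union(*buckets["SHOULD"])
    else:
        # Pure negative: there is no positive set to subtract from.
        hits = set()
    for excluded in buckets["MUST_NOT"]:
        hits -= excluded
    return hits


# (-one) OR (-two): two optional sub-queries, each purely negative.
PAREN_FORM = boolean(
    ("SHOULD", boolean(("MUST_NOT", term("one")))),
    ("SHOULD", boolean(("MUST_NOT", term("two")))),
)

# (*:* AND -one) OR (*:* AND -two): each sub-query now has a positive clause.
WORKAROUND = boolean(
    ("SHOULD", boolean(("MUST", MATCH_ALL), ("MUST_NOT", term("one")))),
    ("SHOULD", boolean(("MUST", MATCH_ALL), ("MUST_NOT", term("two")))),
)

paren_hits = evaluate(PAREN_FORM, DOCS)
workaround_hits = evaluate(WORKAROUND, DOCS)
```

In this model each sub-query of the first form matches nothing on its own, so OR-ing them gives an empty set. With `*:*` in each sub-query there is something to subtract from, and the union is docs 1, 2 and 4, which is the real boolean answer.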

Phew, crazy stuff. So that's a weird solution to getting actual boolean 
logic behavior for that pattern, but in general, I'm kind of wanting a 
parser that will give actual boolean logic behavior. Maybe someday I can 
find time to write it in Java (not the quickest thing for me, not 
familiar with the code at all).

Jonathan

