lucene-dev mailing list archives

From Doug Cutting <DCutt...@grandcentral.com>
Subject RE: Lucene Query Structure
Date Tue, 19 Feb 2002 16:04:46 GMT
> From: Joshua O'Madadhain [mailto:jmadden@ics.uci.edu]
> 
> After considerable study of the documentation, I am still 
> confused about the semantics of BooleanQuery.
> 
> Now, as sjb pointed out, "(query, false, false)" doesn't 
> really seem to have the semantics of a boolean OR.

In fact, it does.

> In particular:
> 
> (1) It's a unary operator: add() adds a Query (or a 
> BooleanClause) to a BooleanQuery.  OR is a binary operator.

OR is not inherently binary.

> The semantics of a boolean OR should be that at least *one* of the
> components (queries) must be satisfied in order for the entire
> expression (composite query) to be satisfied.

That is the case here.

> I conclude that either 
> (a) I simply don't understand the proper use of BooleanQuery, or 
> (b) BooleanQuery cannot be used to express a boolean OR.  

(a)

A good analogy for the semantics of BooleanQuery is the syntax of most
internet search engines (except Google), which let you put '+' or '-' in
front of a word to require or prohibit it.  (Google requires terms by
default.)  A term with no plus or minus is not required for a match, but
documents containing it are still included in the results.
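
To make the '+' / '-' analogy concrete, here is a minimal sketch of the
matching rule in plain Java.  It does not use Lucene; Clause,
ClauseSemanticsSketch and matches() are hypothetical names invented for
illustration, and a document is modeled simply as the set of terms it
contains.  The two booleans mirror the (query, required, prohibited)
arguments discussed in this thread.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of one clause: a term plus the two flags passed to add().
final class Clause {
    final String term;
    final boolean required;
    final boolean prohibited;

    Clause(String term, boolean required, boolean prohibited) {
        this.term = term;
        this.required = required;
        this.prohibited = prohibited;
    }
}

final class ClauseSemanticsSketch {
    // Returns true if a document (modeled as its set of terms) matches.
    static boolean matches(List<Clause> clauses, Set<String> docTerms) {
        boolean anyRequired = false;
        boolean anyOptionalHit = false;
        for (Clause c : clauses) {
            boolean present = docTerms.contains(c.term);
            if (c.prohibited) {
                if (present) return false;      // '-' term found: reject
            } else if (c.required) {
                anyRequired = true;
                if (!present) return false;     // '+' term missing: reject
            } else if (present) {
                anyOptionalHit = true;          // plain term found
            }
        }
        // With only plain terms this is an n-ary OR: one hit is enough.
        return anyRequired || anyOptionalHit;
    }

    public static void main(String[] args) {
        // Equivalent of "apple pear": two clauses added as (false, false).
        List<Clause> or = Arrays.asList(
            new Clause("apple", false, false),
            new Clause("pear", false, false));
        Set<String> doc = new HashSet<>(Arrays.asList("pear", "plum"));
        System.out.println(matches(or, doc));
    }
}
```

In this model, adding every clause as (false, false) gives a disjunction
over any number of sub-queries, which is why OR need not be binary.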

This form also facilitates efficient search.  One can keep track of which
query terms a document contains as a bitmask, then check the boolean
expression against this mask.  This is fast if the boolean expression can
itself be expressed as a few simple bitmasks.  Converting an arbitrary
boolean expression to disjunctive normal form is possible, but not trivial.
But with this form, mask construction is trivial.
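
The bitmask check described above can be sketched roughly as follows.
This is an illustration of the idea only, not Lucene's actual scorer:
MaskMatchSketch and its methods are hypothetical names, each clause is
assigned one bit in the order it is added, and the sketch assumes at most
32 clauses so that a single int can hold the mask.

```java
// Hypothetical sketch: one bit per clause, three masks per query.
final class MaskMatchSketch {
    private int requiredMask;
    private int prohibitedMask;
    private int optionalMask;
    private int nextBit = 0;

    // Registers a clause and returns the bit assigned to it.
    int add(boolean required, boolean prohibited) {
        if (nextBit >= 32) {
            throw new IllegalStateException("sketch supports at most 32 clauses");
        }
        int bit = 1 << nextBit++;
        if (prohibited) {
            prohibitedMask |= bit;
        } else if (required) {
            requiredMask |= bit;
        } else {
            optionalMask |= bit;
        }
        return bit;
    }

    // docMask has a bit set for every clause the document satisfies.
    boolean matches(int docMask) {
        if ((docMask & prohibitedMask) != 0) return false;            // hit a '-' clause
        if ((docMask & requiredMask) != requiredMask) return false;   // missed a '+' clause
        // No required clauses: at least one optional clause must hit.
        return requiredMask != 0 || (docMask & optionalMask) != 0;
    }
}
```

Building the three masks takes one OR per clause, and checking a document
takes a handful of AND and compare operations, with no conversion of the
query to a normal form.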

Doug


