lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Joshua O'Madadhain <jmad...@ics.uci.edu>
Subject Re: Lucene Query Structure
Date Tue, 19 Feb 2002 08:22:58 GMT
Actually, Winton's suggestion doesn't work because it's inconsistent with
the syntax of BooleanQuery() (the constructor doesn't take arguments, and
add() takes one Query argument, not two).

After considerable study of the documentation, I am still confused about
the semantics of BooleanQuery.  I think I can answer the original
syntactic question posed by sjb, but the overall motivation escapes me.

I believe the correct syntax is (given TermQueries a, b, c, and d):

// create a BooleanQuery for '(a AND b)'
BooleanQuery bq_ab = new BooleanQuery();
bq_ab.add(a, true, false);
bq_ab.add(b, true, false);

// same as above with c and d
BooleanQuery bq_cd = new BooleanQuery();
bq_cd.add(c, true, false);
bq_cd.add(d, true, false);

// join two BooleanQueries together
BooleanQuery bq_abcd = new BooleanQuery();
bq_abcd.add(bq_ab, false, false);
bq_abcd.add(bq_cd, false, false);


Now, as sjb pointed out, "(query, false, false)" doesn't really seem to
have the semantics of a boolean OR.  In particular:

(1) It's a unary operator: add() adds a Query (or a BooleanClause) to a
BooleanQuery.  OR is a binary operator.

(2) "add.(query, false, false)" adds a query Q whose satisfaction is
*irrelevant* to that of the resultant BooleanQuery BQ: documents are
neither required to satisfy Q, nor required to *not* satisfy Q, in order
for BQ to be satisfied.  The semantics of a boolean OR should be that at
least *one* of the components (queries) must be satisfied in order for the
entire expression (composite query) to be satisfied.


I conclude that either 
(a) I simply don't understand the proper use of BooleanQuery, or 
(b) BooleanQuery cannot be used to express a boolean OR.  

If (b) is true, either the semantics of BooleanQuery need revising, or
it needs to get called something else so as not to be confusing.  

If the semantics of BooleanQuery are revised, I would suggest changing the
syntax as well to reflect the binary nature of Boolean operators:

BooleanQuery.add(Query q1, Query q2, boolean and)

which would equate to (q1 AND q2) if 'and' is true, otherwise (q1 OR q2).


If there are any papers or references which would explain this better,
that would be great.  Otherwise, I would really appreciate it if one of
the developers would take a few moments to clarify this issue.  I'm trying
to use Lucene as a platform for research in IR; to do this I need a clear
understanding of the exact definition of the scoring system, especially as
it relates to the semantics of queries involving multiple terms.

Once I get a clear understanding of this issue, I would be happy to write
it up and submit it as an addition to the FAQ/docs.

Thanks in advance for any assistance rendered in getting this sorted out.

Regards,

Joshua

On Mon, 18 Feb 2002, Winton Davies wrote:

> BQ(Term, Term, Include, Exclude)
> 
> BQ(
>   BQ(a,b,true,false)
>   BQ(a,b,true,false)
>   false,
>   false)
> 
> Should work...
>   Winton

> >Lets say I have two queries which I want to combine into one:
> >
> >(a and b) OR (c and d) 
> >
> >I would use QueryParser.parse to form the subqueries, but how do I/can I
> >combine them with the OR logic?
> >
> >The BooleanQuery can be used to piece queries together but it does not
> >support OR's correct?  (It only supports include, exclude, and 
> >"preferential".)
> >
> >Thanks for any help,
> >
> >sjb



 jmadden@ics.uci.edu...Obscurium Per Obscurius...www.ics.uci.edu/~jmadden
    Joshua Madden: Information Scientist, Musician, Philosopher-At-Tall
 It's that moment of dawning comprehension that I live for--Bill Watterson
My opinions are too rational and insightful to be those of any organization.



--
To unsubscribe, e-mail:   <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-dev-help@jakarta.apache.org>


Mime
View raw message