lucene-java-user mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: Optional terms in BooleanQuery
Date Mon, 21 May 2007 00:48:55 GMT
I like to think of it like this:

Each doc is going to get a score -- if the score is positive the doc 
will be a hit, if the score is 0 the doc will not be a hit.

If a boolean clause is Occur.MUST and it is not found, the score will be 
dropped to 0 no matter what (if found, the score is obviously 
increased). If a boolean clause is Occur.MUST_NOT and is found then the 
score will be dropped to 0 no matter what.
If a boolean clause is Occur.SHOULD and it is found, a positive number 
is added to the score; if it is not found, nothing is added to the score.

Now you see why it says: "Use this operator for clauses that /should/ 
appear in the matching documents. For a BooleanQuery with two |SHOULD| 
subqueries, at least one of the clauses must appear in the matching 
documents."

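So in a query with only Occur.SHOULD clauses, at least one of them needs 
to be found to lift the score above 0 and make a hit. In your query the 
Occur.MUST clauses A and B already do that, so C, D and E stay optional 
and only raise the score of the docs that contain them.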
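
Here is a small sketch of that mental model in plain Java, in case it 
helps. It is a toy, not Lucene's actual scorer: the Occur enum is 
redeclared so the code needs no Lucene jar, and the Clause class, the 
fixed MATCH_BONUS and the helper methods are all made up for 
illustration. Real scores depend on the term and the document.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of the mental picture in this mail; NOT Lucene's actual scorer.
final class BooleanScoreSketch {

    // Same names as BooleanClause.Occur, redeclared so no Lucene jar is needed.
    enum Occur { MUST, SHOULD, MUST_NOT }

    // Hypothetical stand-in for a clause: one term plus how it must occur.
    static final class Clause {
        final String term;
        final Occur occur;

        Clause(String term, Occur occur) {
            this.term = term;
            this.occur = occur;
        }
    }

    // Assumption: every found clause adds the same amount to the score.
    static final double MATCH_BONUS = 1.0;

    // Returns 0 when the doc is not a hit, a positive score otherwise.
    static double score(List<Clause> clauses, Set<String> docTerms) {
        double score = 0.0;
        for (Clause c : clauses) {
            boolean found = docTerms.contains(c.term);
            switch (c.occur) {
                case MUST:
                    if (!found) {
                        return 0.0; // required clause missing: never a hit
                    }
                    score += MATCH_BONUS;
                    break;
                case MUST_NOT:
                    if (found) {
                        return 0.0; // prohibited clause present: never a hit
                    }
                    break;
                case SHOULD:
                    if (found) {
                        score += MATCH_BONUS; // optional clause only adds
                    }
                    break;
            }
        }
        return score;
    }

    // Helper: a "document" is just the set of terms it contains.
    static Set<String> doc(String... terms) {
        return new HashSet<>(Arrays.asList(terms));
    }

    // The query from the question: A and B required, C, D and E optional.
    static List<Clause> requiredAndOptional() {
        return Arrays.asList(
                new Clause("A", Occur.MUST),
                new Clause("B", Occur.MUST),
                new Clause("C", Occur.SHOULD),
                new Clause("D", Occur.SHOULD),
                new Clause("E", Occur.SHOULD));
    }

    // A query made of SHOULD clauses only.
    static List<Clause> onlyOptional() {
        return Arrays.asList(
                new Clause("C", Occur.SHOULD),
                new Clause("D", Occur.SHOULD));
    }

    public static void main(String[] args) {
        System.out.println(score(requiredAndOptional(), doc("A", "B")));      // 2.0
        System.out.println(score(requiredAndOptional(), doc("A", "B", "C"))); // 3.0
        System.out.println(score(requiredAndOptional(), doc("A", "C")));      // 0.0
        System.out.println(score(onlyOptional(), doc("A")));                  // 0.0
        System.out.println(score(onlyOptional(), doc("D")));                  // 1.0
    }
}
```

In this toy a doc with just A and B is still a hit, and C only moves it 
up the ranking. With nothing but SHOULD clauses, a doc that matches none 
of them stays at 0.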

- Mark

Peter Bloem wrote:
> I'm constructing a search with some required terms and some optional 
> terms in the query. According to some earlier posts that looks like 
> "+A +B C D E" in query syntax for required terms A and B and optional 
> terms C D and E. In other words, Lucene considers all documents that 
> have both A and B, and ranks them higher if they also have C D or E.
>
> I'm wondering how this translates to a BooleanQuery. I know I should 
> use BooleanClause.Occur.MUST for A and B, and I guess I should use 
> BooleanClause.Occur.SHOULD for C, D and E. However, the javadocs for 
> BooleanClause.Occur.SHOULD state:
>
> "Use this operator for clauses that /should/ appear in the matching 
> documents. For a BooleanQuery with two |SHOULD| subqueries, at least 
> one of the clauses must appear in the matching documents."
>
> Does this last sentence actually mean that only a query with _just_ 
> SHOULD clauses (i.e. only SHOULD clauses) requires one of the clauses 
> to match, or will the BooleanQuery described above actually constrain 
> the search results to (A AND B) AND (C OR D OR E)? If so, what should 
> I use instead?
>
> thank you,
> Peter
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org