lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From John Cwikla <Cwi...@Biz360.com>
Subject RE: GoogleQueryParser
Date Thu, 12 Sep 2002 19:18:27 GMT

Actually to expand a little more after a little more digging, it
appears that the AND/OR terms are being flattened into a list of
+ - or optional query terms that are used to remove/add results
one after the other.

In this sense, I think the AND, OR and NOT operators seems to have
been an afterthought to the queryparser, since they cannot give the
same results as +, - and optional.  AND, OR and NOT should use intersections
and unions, while + and - is doing strict adds or rejections.

I guess I was expecting a syntax tree, but it looks like just flattening
of terms.

cwikla

-----Original Message-----
From: John Cwikla [mailto:Cwikla@Biz360.com]
Sent: Thursday, September 12, 2002 11:53 AM
To: 'Lucene Users List'
Subject: RE: GoogleQueryParser




So here is the problem, AND default operator:

a b OR c == a AND b OR c == c OR (a AND b) == c OR (+a +b) != c b +a != c +b
+a

However with default OR operator:

a AND b c == a AND b OR c == c OR (a AND b) == c OR (+a +b) == c (+a +b)

Since AND and OR do not actually mean "required" or "optional" in a strict
boolean sense, I claim you cannot correctly use the query parser with
a default AND operator and get results that would be expected.

I haven't looked more into the QueryParser yet, but in the last case with
the
AND operator, if at some point the internal query "switched" to OR, then
the last item would be correct if it had parenthesis like c (+a +b)

Or is it early and I'm missing something?

cwikla



-----Original Message-----
From: Halácsy Péter [mailto:halacsy.peter@axelero.com]
Sent: Wednesday, September 11, 2002 11:58 PM
To: Lucene Users List
Subject: RE: GoogleQueryParser




>-----Original Message-----
>From: Philip Chan [mailto:PChan@Biz360.com]
>Sent: Wednesday, September 11, 2002 11:04 PM
>To: Lucene Users List
>Subject: RE: GoogleQueryParser
>
>
>I think there's a bug, if I set the default operator to be OR, 
>when I run
>
>java org.apache.lucene.queryParser.QueryParser "a AND b OR c"
>
>it will give me the result of "+a +b c"
>if I set the default operator to be AND, and run it with the 
>term "a b OR
>c", it will give me "+a b c", which is different

To be exact it's not a bug, it's feature ;) Well, the structured query
language of Lucene (and Google and others) is not a strict boolean language.
For example I think the QueryParser of Lucene do not support parenthesis: a
AND (b OR C) 

Instead of strict boolean logic it supports constraint on query terms: a
query term is either required or optional or prohibited. If you write + sign
before the term it will be required. If you write - it will be prohibited.
The question is: is a term required or optional if you do not specify
anything? 

DEFAULT_OPERATOR_OR (default QueryParser):
A B C --> all three terms are optional

DEFAULT_OPERATOR_AND (Google style):
A B C --> +A +B +C all three terms are required.

Because a b OR c query is not a strict boolean query, the query parser can
choose how to translate it. +a +b c not too good since doesn't equal to the
result of input query c OR a b

peter






>
>-----Original Message-----
>From: Halácsy Péter [mailto:halacsy.peter@axelero.com]
>Sent: Wednesday, September 11, 2002 4:49 AM
>To: Lucene Users List; Clemens Marschner
>Subject: RE: GoogleQueryParser
>
>
>
>
>>-----Original Message-----
>>From: Eric Jain [mailto:Eric.Jain@isb-sib.ch]
>>Sent: Wednesday, September 11, 2002 1:44 PM
>>To: Clemens Marschner
>>Cc: Lucene Users List
>>Subject: Re: GoogleQueryParser
>>
>>
>>> queryParser.setOperator(QueryParser.DEFAULT_OPERATOR_AND);
>>
>>Thanks, that would be exactely what I need. Must be a new 
>>method, not yet in
>>the public release?
>>
>check out the new QueryParser from the cvs
>
>peter
>
>--
>To unsubscribe, e-mail:
><mailto:lucene-user-unsubscribe@jakarta.apache.org>
>For additional commands, e-mail:
><mailto:lucene-user-help@jakarta.apache.org>
>
>--
>To unsubscribe, e-mail:   
<mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail:
<mailto:lucene-user-help@jakarta.apache.org>


--
To unsubscribe, e-mail:
<mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail:
<mailto:lucene-user-help@jakarta.apache.org>

--
To unsubscribe, e-mail:
<mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail:
<mailto:lucene-user-help@jakarta.apache.org>

--
To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>


Mime
View raw message