lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Morus Walter <morus.wal...@tanto.de>
Subject Re: QueryParser refactoring
Date Tue, 08 Mar 2005 09:38:21 GMT
Erik Hatcher writes:
> > ok.
> > I'm a bit puzzled, since I called javacc myself, so generated files 
> > should
> > not matter, but if it's fixed, I don't care about what went wrong.
> 
> Let me know if there is still an issue, though I added this exact case 
> to TestPrecedenceQueryParser and its currently working for me.
> 
I'll check this evening (I did that at home and don't have that stuff
in the office).

> >>> 2) Single term queries using +/- flags are parse to a query without
> >>> flag
> >>> +a -> a
> >>
> >> Hmmm.... this is a debatable one.  It's returning a TermQuery in this
> >> case for "a".  Is that appropriate?  Or should it return a 
> >> BooleanQuery
> >> with a single TermQuery as required?
> >>
> > I'd prefer, if query parser parses queries created by query.toString()
> > to the same query. But that's just a nice to have.
> 
> It's also an impossibility to have.  Here's a simple example, take a 
> Query that is equivalent to A OR B, .toString equals "A B", then parse 
> that with the default operator set to AND and you'll get "+A + B". 

?
Of course toString creates queries for default operator OR and using 
default operator AND will create different queries.
But "A B" would be parsed to "A B" using default operator OR.
I wouldn't expect more.

> I created a modified Query->String converter for my current day time 
> project (as I use a String representation for the most recently used 
> drop-down that is stored as a client-side cookie) that explicitly puts 
> in "OR" between SHOULD BooleanClauses.
> 
You cannot do that in the general case. BooleanQuery allows for queries
that cannot be expressed only using AND/OR/NOT.
E.g. `a b +c'. That's not `(a OR b) AND c' nor any other boolean expression.
It's also not expressable in QP syntax for default operator AND.

Basically QP (with default OR) allows for two different approaches:
a) you may construct a boolean query specifying the required/prohibited
    flags for each clause using +/-
b) you may use boolean expressions using AND/OR/NOT.

toString creates queries of form a).
Using default AND makes version a) less useful, since you cannot create
clauses that are neither requried nor prohibited any longer.

> > Ok.
> > The question how to handle BooleanQueries, that contain prohibited 
> > terms
> > only, is a question on it's own.
> > In my fix I choose to silently drop these queries. Basically because 
> > it's
> > effectivly dropped during querying anyway.
> 
> Silently drop as in you removed them entirely from the resultant Query?
> 
Right. `a AND (NOT b)'  parses to `a'

> That'd be easy enough to add - but is that what we want to happen?  
> Community, thoughts?
> 
Throwing an exception is presumably the other alternative.
Could that check be done in an overwritable method?

> > In an application, I handled this by dropping the query and notifying 
> > the
> > user, that some part of the query could not be handled and was ignored.
> 
> How did your application notice that part of the query was dropped?
> 
QueryParser told it ;-)
I used a modified query parser that was provided with two StringBuffers
and QP filled one with droped stop words (something to report to the user
as well) and the other with droped subqueries.

Of course this could be done with a nicer api if one restriced query
parsing to a instance method.
In that case QP would keep the information in instance variables and provide
getter methods.
Having a list of recognized tokens could be helpful as well (e.g. to create
spelling suggestions).

Morus

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message