lucene-java-user mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: QueryParser refactoring
Date Tue, 08 Mar 2005 13:46:43 GMT

On Mar 8, 2005, at 4:38 AM, Morus Walter wrote:
>> I created a modified Query->String converter for my current day time
>> project (as I use a String representation for the most recently used
>> drop-down that is stored as a client-side cookie) that explicitly puts
>> in "OR" between SHOULD BooleanClauses.
>>
> You cannot do that in the general case. BooleanQuery allows for queries
> that cannot be expressed only using AND/OR/NOT.
> E.g. `a b +c'. That's not `(a OR b) AND c' nor any other boolean 
> expression.
> It's also not expressible in QP syntax for default operator AND.

I just added this bit of code to my QueryConverterTest:

     BooleanQuery bq = new BooleanQuery();
     bq.add(new TermQuery(new Term("field", "a")),
            BooleanClause.Occur.SHOULD);
     bq.add(new TermQuery(new Term("field", "b")),
            BooleanClause.Occur.SHOULD);
     bq.add(new TermQuery(new Term("field", "c")),
            BooleanClause.Occur.MUST);

     assertEquals("a OR b +c", QueryConverter.convert(bq, "field"));

     System.out.println(QueryParser.parse(
            QueryConverter.convert(bq, "field"),
            "field", new SimpleAnalyzer()));

The output is:
	a OR b +c

And the test passes and the query is expressible in QP syntax for 
AND.... unless I'm missing something obvious here.
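
For anyone following along, here is a minimal sketch of the semantic point raised above, modelled with plain Java collections rather than Lucene itself. It assumes the usual BooleanQuery rule that SHOULD clauses become optional once a MUST clause is present; the names `BooleanSemanticsSketch`, `matches`, `strictBoolean`, `doc` and `aBPlusC` are made up for illustration and are not Lucene API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model, not Lucene code: a document is just a set of terms.
class BooleanSemanticsSketch {
    enum Occur { MUST, SHOULD, MUST_NOT }

    static final class Clause {
        final String term;
        final Occur occur;
        Clause(String term, Occur occur) { this.term = term; this.occur = occur; }
    }

    // Assumed rule: every MUST clause has to match, no MUST_NOT clause may match,
    // and SHOULD clauses are only required when there is no MUST clause at all.
    static boolean matches(List<Clause> clauses, Set<String> doc) {
        boolean hasMust = false;
        boolean anyShouldMatched = false;
        for (Clause c : clauses) {
            boolean present = doc.contains(c.term);
            if (c.occur == Occur.MUST) {
                hasMust = true;
                if (!present) return false;
            } else if (c.occur == Occur.MUST_NOT) {
                if (present) return false;
            } else if (present) {
                anyShouldMatched = true;
            }
        }
        return hasMust || anyShouldMatched;
    }

    // The strict boolean reading: (a OR b) AND c.
    static boolean strictBoolean(Set<String> doc) {
        return (doc.contains("a") || doc.contains("b")) && doc.contains("c");
    }

    static Set<String> doc(String... terms) {
        return new HashSet<>(Arrays.asList(terms));
    }

    // The query `a b +c` from the discussion.
    static List<Clause> aBPlusC() {
        return Arrays.asList(
            new Clause("a", Occur.SHOULD),
            new Clause("b", Occur.SHOULD),
            new Clause("c", Occur.MUST));
    }
}
```

Under this model a document containing only `c` matches `a b +c` but not `(a OR b) AND c`, so the two differ in which documents match, even though the string form round-trips.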

>> Silently drop as in you removed them entirely from the resultant 
>> Query?
>>
> Right. `a AND (NOT b)'  parses to `a'

Is this what we want to happen for a general purpose next generation 
Lucene QueryParser though?  I'm not sure.  Perhaps this should be a 
ParseException instead?

>> That'd be easy enough to add - but is that what we want to happen?
>> Community, thoughts?
>>
> Throwing an exception is presumably the other alternative.
> Could that check be done in an overwritable method?

Good point.  Sure.
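
A rough sketch of what such an overridable check might look like, assuming nothing about the real parser internals. The class names, the `onUnsupportedSubquery` hook, and the `"NOT "` prefix test are all hypothetical stand-ins; a real parser would make this decision on the parsed clauses, not on strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: default behaviour silently drops purely negative subqueries.
class LenientParserSketch {
    // Hook called for each subquery the parser cannot handle.
    // The default does nothing, so the subquery is dropped silently.
    protected void onUnsupportedSubquery(String text) {
        // intentionally empty
    }

    // Stand-in for parsing: a subquery starting with "NOT " is treated as purely negative.
    List<String> keepSupported(List<String> subqueries) {
        List<String> kept = new ArrayList<>();
        for (String s : subqueries) {
            if (s.startsWith("NOT ")) {
                onUnsupportedSubquery(s);
            } else {
                kept.add(s);
            }
        }
        return kept;
    }
}

// Subclass that turns the silent drop into an error instead.
class StrictParserSketch extends LenientParserSketch {
    @Override
    protected void onUnsupportedSubquery(String text) {
        throw new IllegalArgumentException(
            "Cannot handle purely negative subquery: " + text);
    }
}
```

With this shape, `a AND (NOT b)` would reduce to `a` under the lenient class and raise an error under the strict one, and applications could pick either without patching the parser.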

>>> In an application, I handled this by dropping the query and notifying
>>> the user that some part of the query could not be handled and was
>>> ignored.
>>
>> How did your application notice that part of the query was dropped?
>>
> QueryParser told it ;-)
> I used a modified query parser that was provided with two StringBuffers
> and QP filled one with dropped stop words (something to report to the
> user as well) and the other with dropped subqueries.

Perhaps QueryParser should have some additional hooks allowing you to 
either subclass and tap into things or pass in some sort of custom 
listener hook?
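
One possible shape for the listener idea, as a sketch only. `DropListener` and `ListeningTokenFilterSketch` are hypothetical names, and the whitespace split plus lowercasing is a crude stand-in for a real analyzer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical callback interface; nothing like this exists in QueryParser today.
interface DropListener {
    void stopWordDropped(String word);
}

class ListeningTokenFilterSketch {
    private final Set<String> stopWords;
    private final DropListener listener;

    ListeningTokenFilterSketch(Set<String> stopWords, DropListener listener) {
        this.stopWords = stopWords;
        this.listener = listener;
    }

    // Splits on whitespace, lowercases, and reports each stop word to the listener
    // instead of discarding it without a trace.
    List<String> filter(String input) {
        List<String> kept = new ArrayList<>();
        for (String tok : input.trim().toLowerCase().split("\\s+")) {
            if (tok.isEmpty()) {
                continue;
            }
            if (stopWords.contains(tok)) {
                listener.stopWordDropped(tok);
            } else {
                kept.add(tok);
            }
        }
        return kept;
    }
}
```

The caller decides what to do with the notifications, for example collecting them into a list to show the user.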

I'm interested in how you trapped the stop words that were removed - 
did you use a custom analyzer that gave you this information?  Or some 
other technique?

> Of course this could be done with a nicer API if one restricted query
> parsing to an instance method.

We can most definitely do that!  Any objections to removing the static 
parse method?

> In that case QP would keep the information in instance variables and
> provide getter methods.
> Having a list of recognized tokens could be helpful as well (e.g. to
> create spelling suggestions).
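
A sketch of the instance-variable approach, again with made-up names (`StatefulParserSketch`, `getRecognizedTokens`, `getDroppedStopWords`) and whitespace tokenizing standing in for real analysis. It assumes "recognized tokens" means every token seen, stop words included.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical instance-based parser that remembers what happened during the last parse.
class StatefulParserSketch {
    private final Set<String> stopWords;
    private final List<String> recognizedTokens = new ArrayList<>();
    private final List<String> droppedStopWords = new ArrayList<>();

    StatefulParserSketch(Set<String> stopWords) {
        this.stopWords = stopWords;
    }

    // Returns the surviving tokens; the per-parse state is reset on every call.
    List<String> parse(String input) {
        recognizedTokens.clear();
        droppedStopWords.clear();
        List<String> kept = new ArrayList<>();
        for (String tok : input.trim().toLowerCase().split("\\s+")) {
            if (tok.isEmpty()) {
                continue;
            }
            recognizedTokens.add(tok);
            if (stopWords.contains(tok)) {
                droppedStopWords.add(tok);
            } else {
                kept.add(tok);
            }
        }
        return kept;
    }

    // Read-only views of the state from the most recent parse() call.
    List<String> getRecognizedTokens() {
        return Collections.unmodifiableList(recognizedTokens);
    }

    List<String> getDroppedStopWords() {
        return Collections.unmodifiableList(droppedStopWords);
    }
}
```

Keeping state on the instance means one parser should not be shared across threads mid-parse, which is one more reason a static parse method does not fit this design.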

We're getting vastly more sophisticated than just correcting the order 
of precedence.  I can continue to fiddle with this over time, but I 
won't be able to dedicate a solid effort to all of these.  Feel free to 
contribute patches that enhance it.  Let me know if my 
PrecedenceQueryParser needs tweaks to be a usable base or if starting 
over with QueryParser is better.

	Erik

