lucene-java-user mailing list archives

From "Renaud Waldura"
Subject QueryParser Is Badly Broken
Date Thu, 12 Oct 2006 23:11:07 GMT
I'm developing an application used by scientists -- people who have a pretty 
good idea of what logic is -- and they were shocked to find out that no two 
of these queries return the same results:

1- banana AND apple OR orange
2- banana AND (apple OR orange)
3- (banana AND apple) OR orange

I'd expect (1) to be either (2) or (3), but it turns out it's parsed as 
"+banana apple orange". I was rather, uh, dismayed by this find, as it 
doesn't seem to make sense.
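
To make the difference concrete, here is a minimal sketch in plain Java (no Lucene involved) that evaluates the three readings against a few toy documents. It assumes "+banana apple orange" carries the usual required/optional meaning: banana must be present, while apple and orange only influence ranking. The class name, method names, and sample documents are hypothetical and exist only for illustration.

```java
import java.util.List;

class BooleanQueryLogicDemo {

    // Reading of "+banana apple orange": banana is required; apple and orange
    // are optional and (by assumption) do not decide whether a document matches.
    static boolean matchesFlat(List<String> doc) {
        return doc.contains("banana");
    }

    // Query (2): banana AND (apple OR orange)
    static boolean matchesQuery2(List<String> doc) {
        return doc.contains("banana")
                && (doc.contains("apple") || doc.contains("orange"));
    }

    // Query (3): (banana AND apple) OR orange
    static boolean matchesQuery3(List<String> doc) {
        return (doc.contains("banana") && doc.contains("apple"))
                || doc.contains("orange");
    }

    public static void main(String[] args) {
        // Toy documents, each represented as its list of terms.
        List<List<String>> docs = List.of(
                List.of("banana"),
                List.of("orange"),
                List.of("banana", "apple"),
                List.of("banana", "orange"));

        for (List<String> doc : docs) {
            System.out.println(doc
                    + "  flat=" + matchesFlat(doc)
                    + "  (2)=" + matchesQuery2(doc)
                    + "  (3)=" + matchesQuery3(doc));
        }
    }
}
```

Under these assumptions, a document containing only "banana" matches the flat form but neither parenthesized query, and a document containing only "orange" matches (3) alone, so the three forms cannot be treated as interchangeable.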

I just spent half a day reading up on the various ways QueryParser is 
broken, by going through the bugs and the mailing-list archives. And I'm 
still unable to come to a conclusion. Here's where I'm at:

    a- queries which mix boolean operators require strict parenthesizing to 
work right

    b- "+" isn't shorthand for "AND"; using it with "AND"/"OR"/"NOT" and the 
default operator "" rarely does what you expect

    c- the stock QueryParser doesn't work well in these cases

    d- there's a new PrecedenceQueryParser that solves *some* of the issues 
but creates others

    e- there is a non-Lucene effort to create a query parser with a 
different syntax

While we are also developing a query-building UI, users must be able to 
enter text queries as well. What do other folks do? I mean, this is pretty 
bad. I can hardly go back to my scientists and tell them Lucene is unable to 
handle 2 boolean operators, that they should parenthesize everything by 
hand. I mean, that's just cheesy.

