lucene-java-user mailing list archives

From "Mark Miller" <>
Subject Re: QueryParser Is Badly Broken
Date Thu, 12 Oct 2006 23:55:07 GMT
There is also the Surround Query Parser in contrib by the way...I would bet
that Paul will tell you that it does not have these issues. I can't wait to
see the replies on this one...I didn't realize that the QueryParser had
these problems and am a bit skeptical...unfortunately I am away from home
and cannot check it out.

On another note... will be done soon...I only
have not gotten around to finishing the final touches because there did not
appear to be a lot of initial interest (and what there was has waned
drastically) and I am not ready to use it myself yet. It does correctly
handle order of operations, however, and as far as I know it is the only parser
to handle arbitrary nesting and mixing of boolean and proximity queries
(perhaps Surround does as well...I would be really interested to know, but I
assume that it handles only the base cases, i.e. not "(car & basket) within 2
of (horse & carriage within 3 of car)"). Of course, who really cares about such
queries, but hey ;)

You'll get better advice from others more experienced, but my bet is that
Paul's surround parser is top notch and correctly does what you want.

- Mark


On 10/12/06, Renaud Waldura <> wrote:
> I'm developing an application used by scientists -- people who have a pretty
> good idea of what logic is -- and they were shocked to find out that no two
> of these queries return the same results:
> 1- banana AND apple OR orange
> 2- banana AND (apple OR orange)
> 3- (banana AND apple) OR orange
> I'd expect (1) to be either (2) or (3), but it turns out it's parsed as
> "+banana apple orange". I was rather, uh, dismayed by this find, as it
> doesn't seem to make sense.
> I just spent half a day reading up on the various ways QueryParser is
> broken, by going through the bugs and the mailing-list archives. And I'm
> still unable to come to a conclusion. Here's where I'm at:
>     a- queries which mix boolean operators require strict parenthesizing to
> work right
>     b- "+" isn't shorthand for "AND"; using it with "AND"/"OR"/"NOT" and the
> default operator "" rarely does what you expect
>     c- the stock QueryParser doesn't work well in these cases
>     d- there's a new PrecedenceQueryParser at
> solves *some* of the issues but creates others
>     e- there is a non-Lucene effort to create a query parser with a
> different syntax at
> While we are also developing a query-building UI, users must be able to
> enter text queries as well. What do other folks do? I mean, this is pretty
> bad. I can hardly go back to my scientists and tell them Lucene is unable to
> handle 2 boolean operators, that they should parenthesize everything by
> hand. I mean, that's just cheesy.
> --Renaud
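
The parse reported above, "+banana apple orange", is what one would expect from a parser that has no operator precedence and simply flags each term as required or optional while reading left to right. The sketch below is a simplified model of that idea in plain Python, not Lucene's code: the names flat_flags and matches_flat are hypothetical, and the default operator is assumed to be AND because that assumption reproduces the reported output.

```python
def flat_flags(query, default_op="AND"):
    """Flag each term as required ("+") or optional, left to right.

    Simplified model only: no parentheses, no NOT, no precedence.
    default_op="AND" is an assumption chosen to match the reported parse.
    """
    clauses = []   # list of [required, term]
    conj = None    # the explicit operator seen just before the current term
    for tok in query.split():
        if tok in ("AND", "OR"):
            conj = tok
            continue
        if clauses and conj == "AND":
            clauses[-1][0] = True    # explicit AND makes the previous term required
        if clauses and conj == "OR" and default_op == "AND":
            clauses[-1][0] = False   # explicit OR makes the previous term optional
        if default_op == "AND":
            required = conj != "OR"
        else:
            required = conj == "AND"
        clauses.append([required, tok])
        conj = None
    return " ".join(("+" if req else "") + term for req, term in clauses)


def matches_flat(flagged, doc):
    """A document matches if it has every required term; with no required
    terms, it must have at least one optional term."""
    terms = flagged.split()
    required = [t[1:] for t in terms if t.startswith("+")]
    optional = [t for t in terms if not t.startswith("+")]
    if required:
        return all(t in doc for t in required)
    return any(t in doc for t in optional)


parsed = flat_flags("banana AND apple OR orange")
print(parsed)  # +banana apple orange

for doc in ({"banana"}, {"orange"}):
    q1 = matches_flat(parsed, doc)
    q2 = "banana" in doc and ("apple" in doc or "orange" in doc)
    q3 = ("banana" in doc and "apple" in doc) or "orange" in doc
    print(sorted(doc), q1, q2, q3)
```

Under this model a document containing only "banana" matches query 1 but neither 2 nor 3, and a document containing only "orange" matches query 3 but not query 1, which is consistent with the observation that no two of the three queries agree. Parenthesizing explicitly, as in queries 2 and 3, avoids the ambiguity.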
