lucene-dev mailing list archives

From "Doron Cohen (JIRA)" <>
Subject [jira] Updated: (LUCENE-933) QueryParser can produce empty sub BooleanQueries when Analyzer produces no tokens for input
Date Thu, 21 Jun 2007 01:35:26 GMT


Doron Cohen updated LUCENE-933:

    Attachment: lucene-933_nullify.patch

OK, attaching two different fixes (as discussed above):
  (1)  lucene-933_backwards_comapatible.patch 
  (2)  lucene-933_nullify.patch

All tests pass with either of these.

The "nullify" approach requires more changes, especially in tests as well as in MemoryIndex.
After making the fixes required for the tests to pass with this (nullifying) approach, I came
to the conclusion that it is better to continue not returning null queries as the result of
parsing; otherwise there will be lots of "noise".

So I would like to commit patch (1), unless someone points out a problem that I missed.
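
To make the "noise" argument concrete, here is a minimal Java sketch of the two return conventions. It is an illustration only: the class and method names are hypothetical, a query is modeled as a plain list of term strings rather than a Lucene Query, and nothing here is taken from either attached patch.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical model: a "query" is just the list of terms that survived analysis.
final class NullVsEmptySketch {

    // Convention A (assumed to resemble the backwards-compatible idea):
    // parsing always returns a query object, even when it has no clauses.
    static List<String> parseKeepingEmpty(List<String> analyzedTerms) {
        return analyzedTerms;
    }

    // Convention B (assumed to resemble the "nullify" idea):
    // parsing returns null when nothing survived analysis.
    static List<String> parseNullifying(List<String> analyzedTerms) {
        return analyzedTerms.isEmpty() ? null : analyzedTerms;
    }

    // A caller written against convention A needs no special case.
    static int clauseCount(List<String> query) {
        return query.size();
    }

    // Under convention B every such caller has to grow a null check.
    static int clauseCountNullSafe(List<String> query) {
        return query == null ? 0 : query.size();
    }

    public static void main(String[] args) {
        List<String> noTerms = Collections.emptyList();
        System.out.println(clauseCount(parseKeepingEmpty(noTerms)));
        System.out.println(clauseCountNullSafe(parseNullifying(noTerms)));
    }
}
```

Under the null-returning convention, any existing caller or test that resembles clauseCount would need to be revisited, which is the kind of ripple described above.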

> QueryParser can produce empty sub BooleanQueries when Analyzer produces no tokens for input
> -------------------------------------------------------------------------------------------
>                 Key: LUCENE-933
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Bug
>            Reporter: Hoss Man
>            Assignee: Doron Cohen
>         Attachments: lucene-933_backwards_comapatible.patch, lucene-933_nullify.patch
> as triggered by SOLR-261, if you have a query like this...
>    +foo:BBB  +(yak:AAA  baz:CCC)
> ...where the analyzer produces no tokens for the "yak:AAA" or "baz:CCC" portions of the
> query (possibly because they are stop words), the resulting query produced by the QueryParser
> will be...
>   +foo:BBB +()
> ...that is, a BooleanQuery with two required clauses, one of which is an empty BooleanQuery
> with no clauses.
> this does not appear to be "good" behavior.
> In general, QueryParser should be smarter about what it does when parsing encounters
> parens whose contents result in an empty BooleanQuery -- but what exactly it should do in
> the following situations...
>  a)  +foo:BBB +()
>  b)  +foo:BBB ()
>  c)  +foo:BBB -()
> ...is up for interpretation.  I would think situation (b) clearly lends itself to dropping
> the sub-BooleanQuery completely.  Situation (c) may also lend itself to that solution, since
> semantically it means "don't allow a match on any queries in the empty set of queries".
> I have no idea what the "right" thing to do for situation (a) is.
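
As an illustration of the three situations above, here is a minimal Java sketch that drops empty parenthesized sub-queries from a parsed query tree. The classes (Bool, Term, Clause, Occur) are hypothetical stand-ins, not Lucene's own classes, and the sketch applies the same "drop it" rule to (a), (b) and (c) purely for demonstration. Whether that is the right behavior for (a) is exactly the open question, and this does not describe either attached patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parsed query tree; not Lucene's API.
final class EmptyClauseSketch {
    enum Occur { MUST, SHOULD, MUST_NOT }

    interface Node {}

    // A single term such as foo:BBB.
    static final class Term implements Node {
        final String field;
        final String text;
        Term(String field, String text) { this.field = field; this.text = text; }
    }

    static final class Clause {
        final Occur occur;
        final Node node;
        Clause(Occur occur, Node node) { this.occur = occur; this.node = node; }
    }

    // A boolean query: an ordered list of clauses, possibly empty.
    static final class Bool implements Node {
        final List<Clause> clauses = new ArrayList<>();
        Bool add(Occur occur, Node node) {
            clauses.add(new Clause(occur, node));
            return this;
        }
    }

    // Returns a copy of the tree in which every sub-Bool that ends up
    // with no clauses is dropped, regardless of its Occur flag.
    static Bool prune(Bool query) {
        Bool out = new Bool();
        for (Clause c : query.clauses) {
            Node n = c.node;
            if (n instanceof Bool) {
                Bool sub = prune((Bool) n);
                if (sub.clauses.isEmpty()) {
                    continue; // empty parens: skip the whole clause
                }
                n = sub;
            }
            out.add(c.occur, n);
        }
        return out;
    }

    // Renders the tree in the +/- notation used in the issue text.
    static String render(Node n) {
        if (n instanceof Term) {
            Term t = (Term) n;
            return t.field + ":" + t.text;
        }
        StringBuilder sb = new StringBuilder();
        for (Clause c : ((Bool) n).clauses) {
            if (sb.length() > 0) sb.append(' ');
            if (c.occur == Occur.MUST) sb.append('+');
            else if (c.occur == Occur.MUST_NOT) sb.append('-');
            boolean nested = c.node instanceof Bool;
            if (nested) sb.append('(');
            sb.append(render(c.node));
            if (nested) sb.append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Builds situations (a), (b) and (c) and shows the pruned form of each.
        for (Occur o : new Occur[] { Occur.MUST, Occur.SHOULD, Occur.MUST_NOT }) {
            Bool q = new Bool().add(Occur.MUST, new Term("foo", "BBB")).add(o, new Bool());
            System.out.println(render(q) + "  ->  " + render(prune(q)));
        }
    }
}
```

Note that the pruning is recursive, so a group that only contains other empty groups is dropped as well. If every clause of the top-level query is dropped, the sketch still returns an empty Bool rather than null.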

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
