lucene-dev mailing list archives

From "Burton-West, Tom" <tburt...@umich.edu>
Subject re: LUCENE-167 and Solr default handling of Boolean operators is broken
Date Thu, 01 Dec 2011 17:51:06 GMT
The default query parser in Solr does not handle precedence of Boolean operators in the way
most people expect.

"A AND B OR C" gets interpreted as "A AND (B OR C)" rather than "(A AND B) OR C". There are
numerous other examples in the JIRA ticket for LUCENE-167, in this article on the wiki:
http://wiki.apache.org/lucene-java/BooleanQuerySyntax
and in this blog post: http://robotlibrarian.billdueber.com/solr-and-boolean-operators/
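
To make the difference concrete, here is a minimal sketch in plain Python. It is not Lucene or Solr code, and the function names are made up for illustration. It simply compares the grouping described above with the conventional one, where AND binds tighter than OR, and lists the cases where they disagree.

```python
from itertools import product


def reported_reading(a, b, c):
    # Grouping this message attributes to the default parser: A AND (B OR C)
    return a and (b or c)


def expected_reading(a, b, c):
    # Conventional precedence, AND binds tighter than OR: (A AND B) OR C
    return (a and b) or c


def disagreements():
    """Return every (A, B, C) truth assignment on which the two readings differ."""
    return [
        (a, b, c)
        for a, b, c in product([False, True], repeat=3)
        if reported_reading(a, b, c) != expected_reading(a, b, c)
    ]


if __name__ == "__main__":
    for a, b, c in disagreements():
        print(f"A={a} B={b} C={c}")
```

Under these assumptions the two readings differ exactly when A is false and C is true, i.e. for documents that match C but not A. A user who expects "(A AND B) OR C" would expect those documents in the results; under the "A AND (B OR C)" reading they are excluded.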

This issue was reported in 2003, but the fix does not seem to have made it into the default
query parser for either Lucene or Solr.

It appears that LUCENE-167 was closed in 2009 based on the assumption that the query parser
in LUCENE-1823 would become the default Lucene query parser. However, LUCENE-1823 seems to have
gotten bogged down and is not yet resolved. I do see that there is a precedence query parser in
LUCENE-1937, which was committed to contrib in the 3x branch:
http://svn.apache.org/viewvc/lucene/dev/branches/branch_3x/lucene/contrib/queryparser/src/java/org/apache/lucene/queryParser/precedence/package.html?view=co

Would it be possible to use the contrib 3x precedence query parser in Solr?
Would this require modifying the LuceneQParserPlugin, and if so, would it make sense to open
a JIRA issue?

Are there any plans to make the precedence query parser the default for either Lucene or Solr?

If not, are there any plans to make it more prominent in the documentation that the default
Lucene query parser has issues with precedence?


A bit more background is below.

Tom Burton-West
http://www.hathitrust.org/blogs/large-scale-search
----------------------------------------------------

More Background

There were some concerns about breaking backward compatibility, but in a mailing list post
in 2005 Yonik Seeley said:
"The current behavior is so surprising that I doubt that no one is
relying on it." (http://www.mail-archive.com/java-user@lucene.apache.org/msg00018.html)

and Doug Cutting said: "+1. Fixing operator precedence seems to me like an acceptable incompatibility.
The change needs to be well documented in release notes, and the old QueryParser should be
available, deprecated, for a time for back-compatibility."
(http://www.mail-archive.com/java-user@lucene.apache.org/msg00037.html)



