lucy-dev mailing list archives

From Marvin Humphrey
Subject Re: [lucy-dev] Re: [KinoSearch] Stopwords and AND queries
Date Fri, 17 Dec 2010 00:46:40 GMT
On Thu, Dec 16, 2010 at 12:42:53PM -0500, Robert Muir wrote:
> FWIW (definitely not trying to imply its the best!), NULL is what
> lucene-java does if the Analyzer returns zero tokens.  This means it has to
> be careful too in all the other processing, for example the code that
> applies boost has to handle the query == null case

I had a discussion with my colleagues here at Eventful about what a query
like this ought to return:

    foo AND ()

We concluded that if you built such a query programmatically (i.e. using the
OO constructors), then it should return nothing.  That's how it works now,
because empty ANDQuery and empty ORQuery both compile down to NULL Matchers --
those empty parens fail to match, so the parent ANDQuery also fails.
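
To make that concrete, here is a toy model in Python rather than Lucy's own
code.  The class names mirror ANDQuery and ORQuery, but the make_matcher()
signature, the dict-of-sets index, and the use of None for a NULL Matcher are
all assumptions made for illustration; this is a sketch, not Lucy's API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

Index = Dict[str, Set[int]]  # hypothetical index: term -> matching doc ids


@dataclass
class TermQuery:
    term: str

    def make_matcher(self, index: Index) -> Optional[Set[int]]:
        docs = index.get(self.term)
        # None stands in for a NULL Matcher (nothing can match).
        return set(docs) if docs else None


@dataclass
class ANDQuery:
    children: List[object] = field(default_factory=list)

    def make_matcher(self, index: Index) -> Optional[Set[int]]:
        if not self.children:
            return None  # an empty ANDQuery compiles to a NULL Matcher
        result = None
        for child in self.children:
            matched = child.make_matcher(index)
            if matched is None:
                return None  # one failing child makes the whole AND fail
            result = matched if result is None else result & matched
        return result or None


@dataclass
class ORQuery:
    children: List[object] = field(default_factory=list)

    def make_matcher(self, index: Index) -> Optional[Set[int]]:
        result: Set[int] = set()
        for child in self.children:
            matched = child.make_matcher(index)
            if matched is not None:
                result |= matched
        return result or None  # an empty ORQuery also yields a NULL Matcher


def search(query, index: Index) -> List[int]:
    matched = query.make_matcher(index)
    return sorted(matched) if matched is not None else []


index = {"foo": {1, 2}, "bar": {2, 3}}
# Programmatic equivalent of: foo AND ()
print(search(ANDQuery([TermQuery("foo"), ANDQuery([])]), index))  # prints []
```

In this model the empty child is a real Query object that matches nothing,
which is different from a missing (NULL) child.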

In contrast, we agreed that it is ambiguous how a tolerant, user-facing query
parser such as Lucy::Search::QueryParser ought to handle such a query string.

It seems reasonable to pursue a resolution to the current bug with a minor
mod limited to the query parsing stage.  I don't think we ought to touch the
Query classes themselves.

Put another way... Analysis, including application of stoplists, belongs to
the query-parsing stage of compilation, and not to the lower level of
compiling a Query object down to a Matcher.  There's no way in Lucy to produce
a PolyQuery (the parent class for
ANDQuery/ORQuery/NOTQuery/RequiredOptionalQuery) which has one or more NULL
child queries.  We don't have semantics for what a NULL child query would
mean, and I don't think we should add such semantics.  We should avoid
following Lucene's example in this case.
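
As a rough illustration of keeping this at the parsing stage, the Python
sketch below prunes any clause whose analysis yields zero tokens before a
query tree is built, so no NULL child ever exists.  Everything in it is
hypothetical: the stoplist, the analyze() and parse_and() helpers, and the
tuple-based tree are invented for the example, and dropping the empty clause
is only one possible policy, not a decision about what QueryParser should do.

```python
STOPLIST = {"the", "a", "of"}  # hypothetical stoplist


def analyze(text):
    """Toy analyzer: lowercase, split on whitespace, drop stopwords."""
    return [tok for tok in text.lower().split() if tok not in STOPLIST]


def parse_and(clauses):
    """Build a nested-tuple query tree for clauses joined by AND.

    Clauses that analyze to zero tokens are pruned here, at parse time,
    so the resulting tree never contains a NULL child.
    """
    children = []
    for clause in clauses:
        tokens = analyze(clause)
        if not tokens:
            continue  # e.g. a clause made up entirely of stopwords
        if len(tokens) == 1:
            children.append(("TERM", tokens[0]))
        else:
            children.append(("PHRASE", tuple(tokens)))
    if not children:
        return ("NOMATCH",)  # nothing left to search for
    if len(children) == 1:
        return children[0]
    return ("AND", tuple(children))


print(parse_and(["foo", "the"]))  # prints ('TERM', 'foo')
```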

Marvin Humphrey
