incubator-lucy-dev mailing list archives

From: Marvin Humphrey <mar...@rectangular.com>
Subject: [lucy-dev] Host QueryParser reimplementations
Date: Mon, 18 Apr 2011 19:39:46 GMT

On Fri, Apr 15, 2011 at 10:27:28PM -0500, Peter Karman wrote:
> The reason I join Marvin in wishing to avoid the QueryParser wars is that
> parsers are notoriously hard to get 100% correct for the 80% of features most
> applications require. And for the other 20% it really becomes application
> specific as to how the parser should behave. 

Well stated. :)

> I think the focus on reliable, flexible *Query classes has been a good design
> choice to date, because it means that it is quite straightforward to roll your
> own query parser (as I have done), entirely suited to your application's needs,
> sidestepping the Lucy QueryParser altogether. That's good library design, imo.

Perhaps we can build on that foundation.

I'm thinking of starting a project at apache-extras.org, with the working
title "LucyX::Search::HostQueryParser".  Its primary purpose would be to
supply hackable reimplementations of Lucy::Search::QueryParser in the host
language, typically using a parser generator -- Parse::RecDescent for Perl,
maybe pyparsing for Python, etc.
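
As a rough illustration of what a hackable host-language parser might look
like, here is a minimal Python sketch. It is a hand-written recursive-descent
parser using only the standard library, standing in for a parser generator
such as pyparsing. The grammar (bare terms, quoted phrases, AND/OR/NOT,
parentheses, implicit AND between adjacent clauses), the tuple-based tree, and
every name in it are assumptions made for the example; none of it is claimed
to match the syntax or behavior of Lucy::Search::QueryParser.

```python
import re

# Hypothetical token grammar: parentheses, quoted phrases, or bare words.
TOKEN_RE = re.compile(r'\s*(?:(\()|(\))|"([^"]*)"|([^\s()"]+))')


def tokenize(query):
    """Split a query string into (kind, text) pairs."""
    tokens = []
    pos = 0
    query = query.rstrip()
    while pos < len(query):
        match = TOKEN_RE.match(query, pos)
        if match is None:
            raise ValueError(f"cannot tokenize at offset {pos}")
        lparen, rparen, phrase, word = match.groups()
        if lparen:
            tokens.append(("LPAREN", "("))
        elif rparen:
            tokens.append(("RPAREN", ")"))
        elif phrase is not None:
            tokens.append(("PHRASE", phrase))
        elif word in ("AND", "OR", "NOT"):
            tokens.append((word, word))
        else:
            tokens.append(("TERM", word))
        pos = match.end()
    return tokens


class Parser:
    """Recursive descent; precedence from loosest to tightest: OR, AND, NOT."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def parse_or(self):
        node = self.parse_and()
        while self.peek() == "OR":
            self.advance()
            node = ("OR", node, self.parse_and())
        return node

    def parse_and(self):
        node = self.parse_not()
        # Assumption: adjacent clauses with no operator are joined with AND.
        while self.peek() in ("AND", "NOT", "TERM", "PHRASE", "LPAREN"):
            if self.peek() == "AND":
                self.advance()
            node = ("AND", node, self.parse_not())
        return node

    def parse_not(self):
        if self.peek() == "NOT":
            self.advance()
            return ("NOT", self.parse_not())
        return self.parse_atom()

    def parse_atom(self):
        kind = self.peek()
        if kind in ("TERM", "PHRASE"):
            return self.advance()
        if kind == "LPAREN":
            self.advance()
            node = self.parse_or()
            if self.peek() != "RPAREN":
                raise ValueError("expected ')'")
            self.advance()
            return node
        raise ValueError(f"unexpected token: {kind}")


def parse_query(query):
    """Parse a query string into a nested-tuple tree."""
    parser = Parser(tokenize(query))
    tree = parser.parse_or()
    if parser.peek() is not None:
        raise ValueError("trailing input")
    return tree
```

A real reimplementation would emit the library's *Query objects rather than
tuples; the point of the sketch is only that each grammar rule is a short
method that an application can override or replace.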

To ensure that the reimplementations are faithful, Apache Lucy would expose
tests for Lucy's QueryParser as a public API.
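
One way such shared tests could work is as plain data that any
reimplementation can be run against. The Python sketch below is hypothetical
throughout: the case format, the tuple-based expected trees, and the names
`CASES`, `check_conformance`, and `toy_parse` are invented for illustration
and do not describe an existing Lucy API.

```python
# Hypothetical shared cases: (query string, expected parse tree).
CASES = [
    ("foo", ("TERM", "foo")),
    ("foo OR bar", ("OR", ("TERM", "foo"), ("TERM", "bar"))),
]


def check_conformance(parse, cases):
    """Return (query, expected, actual) for every case the parser gets wrong."""
    failures = []
    for query, expected in cases:
        try:
            actual = parse(query)
        except Exception as exc:
            # Record the exception type so a crash is reported as a mismatch.
            actual = f"raised {type(exc).__name__}"
        if actual != expected:
            failures.append((query, expected, actual))
    return failures


def toy_parse(query):
    """Stand-in reimplementation: understands only terms joined by ' OR '."""
    terms = [("TERM", part.strip()) for part in query.split(" OR ")]
    tree = terms[0]
    for term in terms[1:]:
        tree = ("OR", tree, term)
    return tree
```

Calling `check_conformance(toy_parse, CASES)` returns an empty list when the
reimplementation agrees with every case, and a list of mismatches otherwise,
so each host language could wrap the same data in its own test framework.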

The second thing the HostQueryParser project would supply is sample code for
alternative query parsers.  It would be liberal about accepting contributions;
the sample code would come with no security or correctness guarantees.  If
people want e.g. a "strict" parser, they can contribute sample code to
HostQueryParser, or they can release their own extension.

There are several motivations for this idea.

  * It provides a constructive way to harness the creative energy that
    people bring to the task of query parsing, which might ordinarily be
    squandered in QueryParser wars over whose last 20%
    Lucy::Search::QueryParser should implement.
  * It establishes a precedent of using apache-extras.org to host
    Lucy-related extensions.
  * It encourages us to approach Lucy's QueryParser with increased
    discipline, requiring that we define a spec and elevating the importance
    of tests.

Marvin Humphrey

