lucene-dev mailing list archives

From: Luke Hospadaruk
Subject: FW: Formal grammar for solr/lucene
Date: Fri, 28 Sep 2012 14:44:27 GMT

Not sure if the dev list is the right place for this, so feel free to
direct me elsewhere.

I'm wondering if there's a formal grammar for:
a) the standard query parser.
b) the dismax or edismax query parsers.

We're building a fairly complex search application on top of Solr, and we
need to be able to report syntax errors back to users, modify some syntax
before it gets sent to Solr (for legacy search syntax we have to support),
etc.  We permit a fairly large number of advanced syntaxes for some users,
but we also support a lot of more basic users, so we need to be able to
correct their queries and provide good feedback as to why things aren't
working.

Right now we have a parser written in Python (our application is all
Python) that mostly works, but it's starting to collect a lot of special
cases and hacks, and we'd really like to convert it to something more
formal before it turns into ugly spaghetti.  I've been considering using
PLY (Python Lex-Yacc) and writing a lex grammar, or maybe writing
something similar that will do parsing based on an easy to
read/change/understand grammar file of my own design.
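
For illustration only, a grammar-driven parser of this kind could start as
small as the sketch below. It is a hand-written recursive-descent parser in
plain Python rather than PLY, and the grammar is a made-up subset (terms,
field:value, quoted phrases, AND/OR/NOT, parentheses), not the actual Lucene
or Solr grammar. All names (TOKEN_RE, QuerySyntaxError, Parser) are
hypothetical.

```python
import re

# Hypothetical token set for a small Lucene-like subset (NOT the real grammar).
TOKEN_RE = re.compile(r"""
    (?P<WS>\s+)
  | (?P<LPAREN>\()
  | (?P<RPAREN>\))
  | (?P<COLON>:)
  | (?P<PHRASE>"[^"]*")
  | (?P<WORD>[^\s():"]+)
""", re.VERBOSE)


class QuerySyntaxError(Exception):
    """Syntax error carrying the character offset, for user-facing feedback."""
    def __init__(self, message, pos):
        super().__init__(f"{message} at position {pos}")
        self.pos = pos


def tokenize(text):
    """Split text into (kind, value, position) tuples, ending with EOF."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:  # e.g. an unterminated double quote
            raise QuerySyntaxError(f"unexpected character {text[pos]!r}", pos)
        if m.lastgroup != "WS":
            kind = m.lastgroup
            if kind == "WORD" and m.group() in ("AND", "OR", "NOT"):
                kind = m.group()  # operators get their own token kind
            tokens.append((kind, m.group(), pos))
        pos = m.end()
    tokens.append(("EOF", "", len(text)))
    return tokens


class Parser:
    # Assumed grammar:
    #   or_expr  := and_expr ("OR" and_expr)*
    #   and_expr := clause ("AND" clause)*
    #   clause   := "NOT" clause | "(" or_expr ")" | [WORD ":"] value
    #   value    := WORD | PHRASE
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.i = 0

    def peek(self):
        return self.tokens[self.i]

    def take(self, kind):
        tok = self.peek()
        if tok[0] != kind:
            raise QuerySyntaxError(
                f"expected {kind}, found {tok[1] or 'end of query'!r}", tok[2])
        self.i += 1
        return tok

    def parse(self):
        node = self.or_expr()
        self.take("EOF")  # reject trailing input
        return node

    def or_expr(self):
        node = self.and_expr()
        while self.peek()[0] == "OR":
            self.i += 1
            node = ("OR", node, self.and_expr())
        return node

    def and_expr(self):
        node = self.clause()
        while self.peek()[0] == "AND":
            self.i += 1
            node = ("AND", node, self.clause())
        return node

    def clause(self):
        kind, value, _ = self.peek()
        if kind == "NOT":
            self.i += 1
            return ("NOT", self.clause())
        if kind == "LPAREN":
            self.i += 1
            node = self.or_expr()
            self.take("RPAREN")
            return node
        # A WORD is always followed by at least the EOF token, so i + 1 is safe.
        if kind == "WORD" and self.tokens[self.i + 1][0] == "COLON":
            self.i += 2
            return ("FIELD", value, self.value())
        return self.value()

    def value(self):
        kind, value, pos = self.peek()
        if kind == "WORD":
            self.i += 1
            return ("TERM", value)
        if kind == "PHRASE":
            self.i += 1
            return ("PHRASE", value[1:-1])  # strip the surrounding quotes
        raise QuerySyntaxError(
            f"expected a term or phrase, found {value or 'end of query'!r}", pos)
```

Calling Parser(text).parse() returns a nested tuple tree that could be
rewritten (for example, to translate legacy syntax) before being serialized
back into a query string, and QuerySyntaxError.pos gives the offset to point
the user at when reporting an error.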

Is there formal documentation of the grammars used by the query parsers
somewhere, or are they baked into the source in a form that's clear enough
to build a grammar from?  I could build up a grammar myself that would do a
lot of the things I need it to do, based just on the user documentation,
and that would be pretty good, but it'd be nice if there were a solid
grammar I could start with to avoid problems with some of the stranger
queries.

