jackrabbit-dev mailing list archives

From "Thomas Mueller" <thomas.tom.muel...@gmail.com>
Subject Re: master plan for jsr 283 query implementation
Date Tue, 11 Sep 2007 17:33:35 GMT

I have used JavaCC, ANTLR, and made hand-written parsers. Hand-written
parsers are more flexible:

- Returning meaningful error messages is easy
- Tokens that are sometimes identifiers and sometimes keywords
  (many in SQL) are not problematic
- Strange grammar can be supported (SQL is strange, but not sure about JCR SQL)
- You can better optimize pure Java (probably irrelevant for Jackrabbit)
- You can support conditional grammar (irrelevant for Jackrabbit)
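The first two points can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class name ContextualKeywords, the helper names, and the toy grammar "SELECT col FROM tbl [ORDER BY col]" are hypothetical and not taken from Jackrabbit or JCR SQL. The idea is that the parser knows which position it is in, so a word like "order" can be a column name in one place and a keyword in another, and the error message can say exactly what was expected.

```java
// Hypothetical sketch, not Jackrabbit code. Tokens are split on whitespace only.
final class ContextualKeywords {
    private final String[] tokens;
    private int pos;

    ContextualKeywords(String sql) {
        tokens = sql.trim().split("\\s+");
    }

    private String current() {
        return pos < tokens.length ? tokens[pos] : "<end of input>";
    }

    // In keyword position, the word must match (case-insensitively).
    private void expectKeyword(String keyword) {
        if (!keyword.equalsIgnoreCase(current())) {
            throw new IllegalArgumentException("Expected keyword " + keyword
                    + " at token " + (pos + 1) + " but found '" + current() + "'");
        }
        pos++;
    }

    // In identifier position, any word is accepted, even one that is a
    // keyword elsewhere in the grammar.
    private String readIdentifier() {
        if (pos >= tokens.length) {
            throw new IllegalArgumentException("Expected an identifier at token "
                    + (pos + 1) + " but found end of input");
        }
        return tokens[pos++];
    }

    // Parses the toy grammar and returns a description of what was read.
    String describe() {
        expectKeyword("SELECT");
        String column = readIdentifier();
        expectKeyword("FROM");
        String table = readIdentifier();
        String orderBy = null;
        if (pos < tokens.length) {
            expectKeyword("ORDER");
            expectKeyword("BY");
            orderBy = readIdentifier();
        }
        return "column=" + column + ", table=" + table + ", orderBy=" + orderBy;
    }
}
```

With this sketch, "SELECT order FROM t ORDER BY order" is accepted, with "order" read as a column name twice and as a keyword once, while "SELECT order t" fails with a message naming the missing FROM and its position.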

Flexibility is not always an advantage, especially if you develop a new
language: ambiguity in the grammar is easily found using a parser
generator. On the other hand, you could write the BNF and still use a
hand-written parser. I wrote a BNF parser / auto-complete tool, of
course hand-written ;-) Maybe this would be interesting for Jackrabbit
as well (a query tool with auto-complete).

Another advantage of a hand-written parser is that there is no new
language, just Java:

- Simplifies the build process a bit
- No need to learn JavaCC / ANTLR

Many people think that hand-written parsers are hard to read. I don't
think so. Compare the JavaCC rule with the hand-written equivalent:

void AndExpression() #void :
{}
{
  ( UnaryExpression() (<AND> UnaryExpression())*
  ) #AndExpression(>1)
}

private Expression readAnd() throws ParseException {
    Expression r = readUnary();
    while (readIf("AND")) {
        r = new AndExpression(r, readUnary());
    }
    return r;
}

The main functions of a hand-written parser are usually:

- read(): read the next token.
- boolean readIf(String token): check whether the current token is
  'token', and consume it if so.
- read(String expected): consume a required token, or throw an exception.

But (obviously) you will add more convenience methods.
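The three functions above can be sketched as follows. This is only an illustration under stated assumptions: the class MiniParser, its whitespace-only tokenizer, the parenthesized operand rule, and the string result (instead of an Expression tree) are all hypothetical and not part of Jackrabbit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Jackrabbit code.
final class MiniParser {
    private final List<String> tokens = new ArrayList<>();
    private int pos;
    private String current; // current token, or null at end of input

    MiniParser(String query) {
        // Simplistic tokenizer: split on whitespace only.
        for (String t : query.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        read();
    }

    // read(): advance to the next token.
    private void read() {
        current = pos < tokens.size() ? tokens.get(pos++) : null;
    }

    // readIf(token): consume the current token if it matches.
    private boolean readIf(String token) {
        if (token.equalsIgnoreCase(current)) {
            read();
            return true;
        }
        return false;
    }

    // read(expected): consume a required token or throw an exception.
    private void read(String expected) {
        if (!readIf(expected)) {
            String found = current == null ? "end of input" : "\"" + current + "\"";
            throw new IllegalArgumentException(
                    "Expected \"" + expected + "\" but found " + found);
        }
    }

    String parse() {
        String r = readAnd();
        if (current != null) {
            throw new IllegalArgumentException("Unexpected token \"" + current + "\"");
        }
        return r;
    }

    private String readAnd() {
        String r = readUnary();
        while (readIf("AND")) {
            r = "and(" + r + ", " + readUnary() + ")";
        }
        return r;
    }

    // An operand is either a parenthesized expression or a single word.
    private String readUnary() {
        if (readIf("(")) {
            String r = readAnd();
            read(")");
            return r;
        }
        if (current == null) {
            throw new IllegalArgumentException("Expected an operand but found end of input");
        }
        String name = current;
        read();
        return name;
    }
}
```

For example, new MiniParser("a AND ( b AND c )").parse() returns "and(a, and(b, c))", and a missing closing parenthesis is reported by read(")"). A real parser would also reject keywords in operand position; that check is left out to keep the sketch short.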

Of course, some things are complicated in both JavaCC and
hand-written parsers (in Jackrabbit, I don't understand JCRSQL.jjt,
Predicate(), lines 300 - 377).

In terms of 'work required': In my view, a hand-written parser requires
about the same amount of work as a JavaCC / ANTLR one, and is
easier to understand for a developer / maintainer.

> - Don't reinvent the wheel :-)
"Re-inventing the wheel" would be writing JavaCC or ANTLR yourself;
I am not suggesting that. It's more like "using a GUI builder"
versus "writing the GUI code yourself".

> - Extensibility and easier maintenance
Having done both, I don't agree.

>  You'll be faster, too, in terms of performance and implementation time.
Performance: a hand-written one can be better optimized (if you want
to). Implementation time: it depends on whether you already know the tool
or have templates (for both approaches).

> - You need to define the grammar
I agree, this needs to be done for the spec.

> JavaCC can generate Javadoc like grammar documentation
I don't think this generated documentation is 'suitable for human
consumption'. So far I found this:
http://www.w3.org/2002/11/xquery-xpath-applets/xpath-jjdoc.html and I
wouldn't want to learn the grammar from this file.

> I think there is time better spent doing other things than writing parsers :-)
I agree, if writing the parser yourself actually takes more time than
using JavaCC. In my view, it doesn't, but this is just my opinion.

