incubator-general mailing list archives

From Gaël Lalire <gael.lal...@gmail.com>
Subject [grammar] pure java syntax analyzer (this is not a compiler of compiler)
Date Sun, 15 Nov 2009 02:18:51 GMT
Hello,

Today, if you want to do syntax analysis, you have to use a compiler-compiler (bison, yacc,
JavaCC, ...) which generates source code.
After generation, you need to compile the generated source code, and only then can you parse
some input.

I dislike this method because:
- you need to learn a meta-language (the language that describes the grammar)
- grammars cannot be reused
- the grammar cannot be dynamic (self-described)

However, I did not find any dynamic grammar analyzer, so I decided to write one.
I split my project into 3 modules:
- API: defines Token, Terminal, Grammar, exceptions, ... A lexical analyzer has to depend
on this module to send terminals to the syntax analyzer.
- Impl: the analyzers (LR, LL, SLR, ...) and some calculation utilities.
- SPI: this module provides user-friendly abstract classes. For example, if you create a grammar
using this module, the non-terminals and their rules are type-checked (generics),
so you can be sure that there will be no ClassCastException. It also provides
an easy way to create arithmetic expressions (you only need to provide the terminals; the helper
creates the rules).
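
To make the SPI idea more concrete, here is a minimal sketch in plain Java of a grammar that is built at runtime and whose non-terminals are typed with generics. All names in it (RuntimeGrammarSketch, Symbol, NonTerminal, binary, ...) are hypothetical and are not the classes of the attached project. The matching strategy is a naive ordered-choice recursive descent, chosen only to keep the example short; it is not one of the LR/LL/SLR analyzers of the Impl module.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BinaryOperator;

final class RuntimeGrammarSketch {

    /** Result of matching a symbol: its semantic value and the next input position. */
    static final class Match<T> {
        final T value;
        final int next;
        Match(T value, int next) { this.value = value; this.next = next; }
    }

    /** Any grammar symbol that yields a value of type T; null means "no match". */
    interface Symbol<T> {
        Match<T> match(String input, int pos);
    }

    /** Terminal matching one exact character. */
    static Symbol<Character> terminal(char c) {
        return (in, pos) -> {
            if (pos < in.length() && in.charAt(pos) == c) {
                return new Match<Character>(c, pos + 1);
            }
            return null;
        };
    }

    /** Terminal matching one decimal digit and yielding its numeric value. */
    static final Symbol<Integer> DIGIT = (in, pos) -> {
        if (pos < in.length() && Character.isDigit(in.charAt(pos))) {
            return new Match<Integer>(in.charAt(pos) - '0', pos + 1);
        }
        return null;
    };

    /** Non-terminal whose rules are added at runtime; every rule must yield a T. */
    static final class NonTerminal<T> implements Symbol<T> {
        private final List<Symbol<T>> rules = new ArrayList<>();

        NonTerminal<T> rule(Symbol<T> rule) {
            rules.add(rule);
            return this;
        }

        @Override
        public Match<T> match(String in, int pos) {
            for (Symbol<T> rule : rules) { // the first matching rule wins
                Match<T> m = rule.match(in, pos);
                if (m != null) {
                    return m;
                }
            }
            return null;
        }
    }

    /** Rule "left op right"; operand types are checked by the compiler, no casts. */
    static <T> Symbol<T> binary(Symbol<T> left, Symbol<?> op, Symbol<T> right,
                                BinaryOperator<T> action) {
        return (in, pos) -> {
            Match<T> l = left.match(in, pos);
            if (l == null) return null;
            Match<?> o = op.match(in, l.next);
            if (o == null) return null;
            Match<T> r = right.match(in, o.next);
            if (r == null) return null;
            return new Match<T>(action.apply(l.value, r.value), r.next);
        };
    }

    /** Builds: expr -> term '+' expr | term ; term -> DIGIT '*' term | DIGIT */
    static NonTerminal<Integer> arithmetic() {
        NonTerminal<Integer> expr = new NonTerminal<>();
        NonTerminal<Integer> term = new NonTerminal<>();
        term.rule(binary(DIGIT, terminal('*'), term, (a, b) -> a * b)).rule(DIGIT);
        expr.rule(binary(term, terminal('+'), expr, (a, b) -> a + b)).rule(term);
        return expr;
    }

    /** Parses and evaluates the whole input, or throws on a syntax error. */
    static int evaluate(String input) {
        Match<Integer> m = arithmetic().match(input, 0);
        if (m == null || m.next != input.length()) {
            throw new IllegalArgumentException("Syntax error in: " + input);
        }
        return m.value;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("1+2*3")); // prints 7
    }
}
```

No source code is generated and no meta-language is involved: the grammar is an ordinary object graph, so a rule can be added with another call to rule(...) while the program runs. Trying to add a rule that yields a String to a NonTerminal<Integer> is rejected by the compiler, which is the kind of guarantee described for the SPI module.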

Why donate to Apache?
I hope that I am not the only one interested in having a runtime grammar tool, and I hope I
can build a community around this project (because I am working alone).
This is not an easy domain and there are many things I do not know about compilation, so a
community could bring faster or new implementations.
Also, Apache is well known in universities, which could be interested in this project for
practical exercises.

Future tasks:
- LL(*) to create (an abstract LL exists but is untested)
- LR(1+) to create (I need documentation)
- LALR(*) to create (I need documentation too)
- SLR(2+) to create (again, I need documentation)
- Naming issues (bad English, badly chosen words, ...)
- Comments
- Tutorial
- Find a way to serialize the grammar's states (the maps of actions on terminal input) and restore
them by simply rebinding the terminals instead of analyzing the grammar again.
- Create bindings with a lexer (ORO? JDK?)
- More reusable grammar parts (boolean expressions, SQL parser, ...)
- Parsing error management
- Create a BNF-to-SPI converter
- ...

I have attached the source code to this mail.
It is a Maven 2 / Eclipse project (Eclipse is not mandatory, but the .project file is provided).

Now I need a champion (if I understand the process correctly; the Apache rules are not simple)
to bring the project in.

Best regards,
Gael Lalire

