cocoon-dev mailing list archives

From Stephan Michels <step...@vern.chem.tu-berlin.de>
Subject Re[2]: text parser
Date Wed, 13 Feb 2002 13:17:03 GMT


On Wed, 13 Feb 2002, Andrew Answer wrote:

> > Finding a name isn't as easy as I thought. :(
>
>   Hmmm.... why do you reject a project name if the domain name is reserved?
>   There are more projects than domains :)

No, but most of them hold a copyright on their name, I think...

>   Well-driven engine! It looks like an XML parser...
>   Suggestions:
>   I've worked with byacc/flex, but have already forgotten the syntax. Maybe
>   it would be better to make the DTD of your grammar more readable?

Previously I used a text form of the grammar, like this:

%token Identifier [A-Za-z][A-Za-z0-9_]*;
%token Number [+\-]? ( ([0-9]+ \.[0-9]*)?(E[+\-]?[0-9]+)?  |
[0-9]*\.[0-9]+(E[+\-]?[0-9]+)? );
%token String \" ( [^\\\"] | \\[^u] |
\\u[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F] )* \";
%token Boolean TRUE|FALSE;
%comment #([^\n]*);
%whitespace [\n\r\t\u0020]+;
%start File;

%%

Tupel : (String | Number) (String | Number)+;
Field : "[" (String | Number | Tupel) ("," (String | Number |
Tupel) )* "]";
Declaration : Identifier (String | Number | Tupel | Field);
Node : Identifier "{" (Node | Declaration)* "}";
File : (Node | Declaration)+;
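
To make the token declarations concrete, here is a minimal sketch of how
they could be tried out with Python's re module. This is not the parser
generator itself, only an illustration. Assumptions: the Number pattern is
read so that plain integers are accepted, the name "Literal" for the
bracket and comma symbols is made up for this example, and Boolean is
given priority over Identifier.

```python
import re

# Token patterns transcribed from the %token declarations.
# Assumption: Number is read as "digits, optional fraction, optional
# exponent" or "optional digits, dot, digits, optional exponent".
TOKEN_SPEC = [
    ("whitespace", r"[\n\r\t ]+"),
    ("comment", r"#[^\n]*"),
    # The lookahead keeps names such as TRUEish from being split.
    ("Boolean", r"(?:TRUE|FALSE)(?![A-Za-z0-9_])"),
    ("Number", r"[+\-]?(?:[0-9]+(?:\.[0-9]*)?(?:E[+\-]?[0-9]+)?"
               r"|[0-9]*\.[0-9]+(?:E[+\-]?[0-9]+)?)"),
    ("String", r'"(?:[^\\"]|\\[^u]|\\u[0-9a-fA-F]{4})*"'),
    ("Identifier", r"[A-Za-z][A-Za-z0-9_]*"),
    # Hypothetical name for the inline symbols used in the rules.
    ("Literal", r"[\[\]{},]"),
]

MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Return (token type, lexeme) pairs, skipping whitespace and comments."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup not in ("whitespace", "comment"):
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

For example, tokenize('Node { size 1.5 }') yields an Identifier, a
Literal, another Identifier, a Number and a closing Literal, which is the
input the Node and Declaration rules expect.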


>   Then you can even write a stylesheet for converting a byacc grammar
>   into your grammar. And use it with your parser - it's a good test.

This I have already done ;-)

>   How about whitespace? Unlike XML, text files need to recognize one
>   or two CR/LF and apply different rules, etc...

What about

<token tsymbol="eol">
 <alt>
  <concat>
   <string content="\r"/><string minOccurs="0" content="\n"/>
  </concat>
  <string content="\n"/>
 </alt>
</token>
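
The eol token above corresponds to the regular expression \r\n?|\n. As a
rough sketch of how "one or two CR/LF" could then get different rules,
the following Python counts consecutive eol matches. The labels
"line_break" and "paragraph_break" are names invented for this example,
not something the parser defines.

```python
import re

# Same alternatives as the eol token: "\r" with an optional "\n", or "\n".
EOL = re.compile(r"\r\n?|\n")
EOL_RUN = re.compile(r"(?:\r\n?|\n)+")


def classify_breaks(text):
    """Split text into ("text", s) items and counted line-ending runs."""
    parts = []
    pos = 0
    for run in EOL_RUN.finditer(text):
        if run.start() > pos:
            parts.append(("text", text[pos:run.start()]))
        count = len(EOL.findall(run.group()))
        # One eol is an ordinary line break, two or more end a paragraph.
        kind = "line_break" if count == 1 else "paragraph_break"
        parts.append((kind, count))
        pos = run.end()
    if pos < len(text):
        parts.append(("text", text[pos:]))
    return parts
```

In a grammar the same distinction could be made with two rules, one
matching a single eol and one matching eol eol+.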

>   Maybe you can produce one text from another (line formatting,
>   adjusting, list formatting, etc.)?

I think so. With a little work, you could do something like
syntax highlighting.

>   And later you can turn it into a Generator/Transformer (but you must
>   produce a SAX stream for it to work properly, I think)...

The (fragment) text parser generator produces a SAX stream, which can be
transformed in a pipeline.
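
To illustrate what "producing a SAX stream" means, here is a small sketch
using Python's xml.sax classes rather than the Cocoon API. It turns a
list of recognized tokens into SAX events on any content handler. The
element names "document" and "token" and the "type" attribute are
hypothetical and chosen only for this example; the real parser's output
vocabulary may look quite different.

```python
import io
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl


def emit_sax(tokens, handler, root="document"):
    """Fire SAX events for (token type, lexeme) pairs on a content handler."""
    handler.startDocument()
    handler.startElement(root, AttributesImpl({}))
    for kind, text in tokens:
        handler.startElement("token", AttributesImpl({"type": kind}))
        handler.characters(text)
        handler.endElement("token")
    handler.endElement(root)
    handler.endDocument()


def to_xml(tokens):
    """Serialize the event stream, standing in for the next pipeline stage."""
    out = io.StringIO()
    emit_sax(tokens, XMLGenerator(out, encoding="utf-8"))
    return out.getvalue()
```

Because emit_sax only talks to the handler interface, the serializer can
be replaced by any other handler, which is what makes the stream usable
as one step of a pipeline.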




