incubator-lucy-dev mailing list archives

From Marvin Humphrey
Subject Re: Charmonizer
Date Thu, 06 Jul 2006 21:47:28 GMT

I think it would be better if Charmonizer's syntax were 1) a
little more C-like, and 2) more concise.  Here's a before-and-after
illustrating how I think things should change:

   /* current */
   CH_foo
     number 22
     string a string
     source CH_quote
         #include <stdio.h>
         int main() { printf("Greetings, earthlings!"); return 0; }
     CH_end_quote
   CH_end_foo

   /* proposed */
   CH_foo('22', 'a string', CH_q
         #include <stdio.h>
         int main() { printf("Salutations, earthlings!"); return 0; }
   CH_q);

Labeled parameters are something I prefer generally to signatures,  
but they're not very C-like, and the small set of functions provided  
by Charmonizer does not benefit from having them.  The real reason  
they're in there is that there happened to be a happy interaction  
between parsing line-by-line and labeled params.  They should go away.

If we go to fixed argument lists, potentially with multiple args on  
one line, parsing them gets more complicated.  The easy way to handle  
this is to delimit each one with quotes -- but Charmonizer's current  
quote mechanism is cumbersome:

   /* passing the number 22 to CH_fubar */

Therefore, support for arguments delimited by single quotes and  
separated by commas should be added.

   CH_baz('22', 'twenty-two');

However, we'll keep the policy of no interpolations and no escapes --  
which keeps the parser extremely small and simple.  So if you have a  
short argument you use plain old single quotes, and if you have a  
longer argument or one which needs single quotes inside it, you use  
the extended quote mechanism.
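
To make the "no interpolations and no escapes" policy concrete, here is a minimal sketch in Python of how a comma-separated list of single-quoted arguments could be read. It is an illustration only, not Charmonizer's actual parser; the function names `parse_quoted_args` and `_skip_ws` are hypothetical, and the extended quote mechanism is deliberately left out.

```python
def _skip_ws(text, pos):
    """Advance pos past any whitespace (hypothetical helper)."""
    while pos < len(text) and text[pos].isspace():
        pos += 1
    return pos


def parse_quoted_args(text):
    """Return the raw contents of each single-quoted, comma-separated argument.

    Assumed rules, following the proposal: every argument is wrapped in
    single quotes, arguments are separated by commas, and nothing inside
    the quotes is interpreted (a backslash is just a backslash).
    """
    args = []
    pos = 0
    while True:
        pos = _skip_ws(text, pos)
        if not text.startswith("'", pos):
            raise ValueError(f"expected opening quote at offset {pos}")
        end = text.find("'", pos + 1)  # the next quote always closes the argument
        if end == -1:
            raise ValueError("unterminated single-quoted argument")
        args.append(text[pos + 1:end])
        pos = _skip_ws(text, end + 1)
        if pos == len(text):
            return args
        if text[pos] != ",":
            raise ValueError(f"expected ',' at offset {pos}")
        pos += 1


print(parse_quoted_args("'22', 'twenty-two'"))  # ['22', 'twenty-two']
```

Because the first closing quote always ends the argument, an argument that itself contains a single quote is rejected here, which is exactly the case the extended quote mechanism is meant to cover.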

Speaking of that extended quote mechanism, CH_quote and CH_end_quote  
should go away, to be replaced by a matching pair of 'CH_q' strings.   
That way, ' and CH_q are parallel constructs.  Plus, Huffman coding  
suggests that the CH_q delimiter should be relatively small -- though  
it still has to be long and weird enough to be unlikely to ever occur  
in stuff you'd want to quote.

Beginning all keywords with 'CH_' is a bit heavy, but it serves to  
remind you that this isn't C, it's Charmonizer.   What's more, anyone  
who's done modular programming in C is used to seeing namespaces  
faked with prefixes.  So I think we should keep that.

Closing functions with 'CH_end_function_name' was nice for creating a  
bulletproof parser, and might have been a useful constraint if there  
were going to be many such parsers.  But if there's only one, it's
better to terminate with a semicolon.

Delimiting argument lists with parens is not strictly necessary from  
a parsing standpoint, but it's clean and familiar.
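
Putting the pieces of the proposal together (a 'CH_' prefixed name, a parenthesized argument list, ' and CH_q as parallel quote delimiters, and a terminating semicolon), a whole statement could be recognized along these lines. This is a hedged Python sketch under those assumptions, not the real Charmonizer implementation; `parse_call` and the pattern names are hypothetical.

```python
import re

# Statement head: a CH_-prefixed name followed by an opening paren.
HEAD = re.compile(r"\s*(CH_\w+)\s*\(\s*")
# One argument, taken verbatim with no escapes: '...' or CH_q ... CH_q.
ARG = re.compile(r"'(?P<short>[^']*)'|CH_q(?P<long>.*?)CH_q", re.DOTALL)
# Separator between arguments.
SEP = re.compile(r"\s*,\s*")
# Statement tail: closing paren and terminating semicolon.
TAIL = re.compile(r"\s*\)\s*;")


def parse_call(src, pos=0):
    """Parse one statement; return (name, args, offset just past the ';')."""
    head = HEAD.match(src, pos)
    if head is None:
        raise ValueError("expected 'CH_name('")
    name, pos = head.group(1), head.end()
    args = []
    while TAIL.match(src, pos) is None:
        if args:  # every argument after the first needs a comma before it
            sep = SEP.match(src, pos)
            if sep is None:
                raise ValueError(f"expected ',' or ');' at offset {pos}")
            pos = sep.end()
        arg = ARG.match(src, pos)
        if arg is None:
            raise ValueError(f"expected quoted argument at offset {pos}")
        short = arg.group("short")
        args.append(short if short is not None else arg.group("long"))
        pos = arg.end()
    return name, args, TAIL.match(src, pos).end()


name, args, end = parse_call("CH_baz('22', 'twenty-two');")
print(name, args)
```

Since quoted text is consumed as a unit, a `);` inside a CH_q block (as in the quoted C source above) does not end the statement early.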

All of this syntax is extensible.  We can add variables, escapes and  
interpolation via " and CH_iq, and more functions later if we want.   
I can think of some functions I'd like to add, so another email's on  
its way...

Sound good?

Marvin Humphrey
Rectangular Research
