From: Pascal Schumacher
To: dev@groovy.apache.org
Subject: Re: More Antlr4-based Groovy parser status update
Date: Sun, 28 Feb 2016 18:23:41 +0100
Message-ID: <56D32D1D.6070303@gmx.net>

Hi Jesper,

thanks for the update. :) Nice to hear you are progressing.

Concerning the ASTBuilder to Java conversion, there is a pull request for this in the old repo: https://github.com/groovy/groovy-core/pull/513

Cheers,
Pascal

On 28.02.2016 at 12:55, Jesper Steen Møller wrote:
Hi Groovy-Dev

Here’s another update on the progress on the Antlr4 parser, as maintained on https://github.com/jespersm/groovy.git (in the antlr4 branch).
To play with it, try:

$ git clone -b antlr4 https://jespersm@github.com/jespersm/groovy.git
$ cd groovy
$ gradle -PuseAntlr4=true console

I’ve fixed a number of issues:
  • Support method pointer operator
  • Attributes/method/property names as strings/gstrings
  • Real support for unary plus and minus (mimics old parser’s behaviour)
  • Compilation units not ending with semicolon or newline
  • Slashy strings could span lines, confusing division statements and comments
I can now explore the new grammar and AST building using the Console, which is fun, but it’s very easy to find unsupported constructs. Mapping out the full Groovy grammar from the documentation alone is quite a task. Just today, I discovered that support is missing for ‘assert’ and for ‘super’ calls. The smaller issues currently are:
  • assert
  • super()
  • Full Unicode letter support for identifiers
  • Support identifiers as property names and map literal entry names

The bigger issue is with converting the ASTBuilder to pure Java, a task I haven’t started yet. Actually, this poses a different question for AST generation: whether to switch from tree-walking the parse tree (so the whole tree must be kept in memory) to the listener-based approach, where the AST is built mostly bottom-up, ensuring a smaller memory footprint.
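
To make the listener-based idea concrete, here is a minimal sketch in plain Java. It does not use ANTLR4 or Groovy's real AST classes: the node types (Num, Add), the Listener interface and the toy parser for sums like "1 + 2 + 3" are all hypothetical names chosen for illustration. The point it tries to show is that when the parser only fires "exit" events, the AST can be assembled bottom-up from a stack, and no parse tree needs to be retained.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ListenerAstSketch {
    // Hypothetical AST node types, not Groovy's real ASTNode hierarchy.
    interface Node { int eval(); }

    static final class Num implements Node {
        final int value;
        Num(int value) { this.value = value; }
        public int eval() { return value; }
    }

    static final class Add implements Node {
        final Node left, right;
        Add(Node left, Node right) { this.left = left; this.right = right; }
        public int eval() { return left.eval() + right.eval(); }
    }

    // Events fired when a rule has been fully matched
    // (analogous in spirit to the exit callbacks of a parse listener).
    interface Listener {
        void exitNumber(int value);
        void exitAdd();
    }

    // Builds the AST bottom-up: when a parent rule exits,
    // its children are already on the stack.
    static final class AstBuildingListener implements Listener {
        private final Deque<Node> stack = new ArrayDeque<>();

        public void exitNumber(int value) { stack.push(new Num(value)); }

        public void exitAdd() {
            Node right = stack.pop();
            Node left = stack.pop();
            stack.push(new Add(left, right));
        }

        Node result() { return stack.peek(); }
    }

    // Toy parser for "a + b + c": it emits events only and keeps no parse tree.
    static Node parse(String src) {
        AstBuildingListener listener = new AstBuildingListener();
        String[] parts = src.split("\\+");
        listener.exitNumber(Integer.parseInt(parts[0].trim()));
        for (int i = 1; i < parts.length; i++) {
            listener.exitNumber(Integer.parseInt(parts[i].trim()));
            listener.exitAdd(); // left-associative: (a + b) + c
        }
        return listener.result();
    }

    public static void main(String[] args) {
        System.out.println(parse("1 + 2 + 3").eval());
    }
}
```

The trade-off in this style is that the listener only ever sees completed children, so constructs that need context from an enclosing rule have to carry that context on the stack as well.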

So you can help me with a couple of answers:
  • Memory: Is this an issue I should be focusing on — and is there a test to baseline against?
  • I’ve discovered a small issue with unary syntax. Currently, nested unary expressions are not supported without parentheses: Try e.g. - -1 or + -1. Is this intentional, or just an artifact of the precedence-refactored Java grammar?
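
For comparison, this is a small sketch of how nested unary operators parse without parentheses when the unary rule refers to itself on the right-hand side. It is a hand-written recursive-descent evaluator in plain Java, not the Groovy grammar; the class name NestedUnarySketch and its methods are hypothetical. It treats each + or - as a separate token and does not model the ++ and -- operators.

```java
class NestedUnarySketch {
    private final String src;
    private int pos = 0;

    NestedUnarySketch(String src) { this.src = src; }

    private void skipSpaces() {
        while (pos < src.length() && src.charAt(pos) == ' ') pos++;
    }

    // unary : ('+' | '-') unary | number
    // The recursive call is what allows "- -1" without parentheses.
    int unary() {
        skipSpaces();
        if (pos >= src.length()) {
            throw new IllegalArgumentException("unexpected end of input");
        }
        char c = src.charAt(pos);
        if (c == '-') { pos++; return -unary(); }
        if (c == '+') { pos++; return unary(); }
        return number();
    }

    // number : DIGIT+
    private int number() {
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
        if (start == pos) {
            throw new IllegalArgumentException("digit expected at position " + pos);
        }
        return Integer.parseInt(src.substring(start, pos));
    }

    static int eval(String src) { return new NestedUnarySketch(src).unary(); }

    public static void main(String[] args) {
        System.out.println(eval("- -1")); // prints 1
        System.out.println(eval("+ -1")); // prints -1
    }
}
```

If the operand of the unary rule were a primary expression instead of unary itself, the second operator would be rejected, which matches the behaviour described in the question.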

-Jesper

