cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Benjamin Lerer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-12598) BailErrorStragery alike for ANTLR grammar parsing
Date Thu, 06 Oct 2016 14:38:20 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-12598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15552096#comment-15552096
] 

Benjamin Lerer commented on CASSANDRA-12598:
--------------------------------------------

ANTLR 3.5.2 perform recovery at 2 places: in [BaseRecognizer::recoverFromMismatchedToken|https://github.com/antlr/antlr3/blob/master/runtime/Java/src/main/java/org/antlr/runtime/BaseRecognizer.java#L588]
and 
in [BaseRecognizer::recover|https://github.com/antlr/antlr3/blob/master/runtime/Java/src/main/java/org/antlr/runtime/BaseRecognizer.java#L353].
 
[BaseRecognizer::recoverFromMismatchedSet|https://github.com/antlr/antlr3/blob/master/runtime/Java/src/main/java/org/antlr/runtime/BaseRecognizer.java#L621]
is not used anymore.

In case of mismatch, {{BaseRecognizer::recoverFromMismatchedToken}} will attend to recover
by either deleting the unexpected token or inserting the missing one, reporting an error in
both cases.
If those 2 technics do not work it will throw a {{MismatchedTokenException}}. That Exception
will be catch by the default code that ANTLR generates at the end of each rule’s method:
{code}
catch (RecognitionException re) {
    reportError(re);
    recover(input,re);
}
{code}

By consequence, the right way to disable recovery is to override {{BaseRecognizer::recoverFromMismatchedToken}}
to make it always throw a {{MismatchedTokenException}} and
to either override the catch code generated by ANTLR using the {{rulecatch}} action or simply
override {{BaseRecognizer::recover}} to make it does nothing.
As you seems to be in favor of the bulletproof approach, overriding {{BaseRecognizer::recover}}
sound like the best option.

The final version of my experimentations is:
|[trunk|https://github.com/blerer/cassandra/tree/12598-trunk]|[utests|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-12598-trunk-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-12598-trunk-dtest/lastCompletedBuild/]|

While trying the different approaches, I realized that several rules have not been written
correctly and were preventing the generated code to compile depending on how the {{rulecatch}}
action was used. So, I fixed those rules in the provided patch.

[~Bereng] Could you have a look at the patch and tell me if it looks fine to you?

> BailErrorStragery alike for ANTLR grammar parsing
> -------------------------------------------------
>
>                 Key: CASSANDRA-12598
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12598
>             Project: Cassandra
>          Issue Type: Bug
>          Components: CQL
>            Reporter: Berenguer Blasi
>            Assignee: Berenguer Blasi
>             Fix For: 3.x
>
>
> CQL parsing is missing a mechanism similar to http://www.antlr.org/api/Java/org/antlr/v4/runtime/BailErrorStrategy.html
> This solves:
> - Stopping parsing instead of continuing when we've got already an error which is wasteful.
> - Any skipped java code tied to 'recovered' missing tokens might later cause java exceptions
(think non-init variables, non incremented integers (div by zero), etc.) which will bubble
up directly and will hide properly formatted error messages to the user with no indication
on what went wrong at all. Just a cryptic NPE i.e



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message