cassandra-commits mailing list archives

From "Robert Stupp (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-7740) Parsing of UDF body is broken
Date Mon, 11 Aug 2014 12:41:12 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092730#comment-14092730 ]

Robert Stupp edited comment on CASSANDRA-7740 at 8/11/14 12:40 PM:
-------------------------------------------------------------------

Note: in CASSANDRA-7562 I changed the parsing a bit, and the end marker is now {{K_END K_BODY}}.

But the main point stands: anything that looks like {{END BODY}} is recognized as the end of
the function body. IMO it was the "lesser of two evils" (compared to single-quote escaping).

I thought about escaping (e.g. {{'}} to {{''}}) but didn't like it. It might work for Java
but, as you said, code in other scripting languages would look really strange.

Something like
{noformat}
CREATE FUNCTION ...
BODY
$function$
    return "END BODY";
$function$
END BODY;
{noformat}
could work, assuming that {{$function$}} is not used within the body.
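
To illustrate the {{$function$}} idea, here is a minimal Python sketch of how a client-side parser might pull out a dollar-quoted body. It is not taken from cqlsh or Cassandra: the helper name {{extract_dollar_quoted}} and the regular expression are hypothetical, and it assumes the opening tag does not occur inside the body.

```python
import re

# Hypothetical pattern (an assumption, not cqlsh code): an opening tag such as
# $function$, the shortest possible body, then the same tag again.
DOLLAR_QUOTED = re.compile(r"\$(\w*)\$(.*?)\$\1\$", re.DOTALL)


def extract_dollar_quoted(statement):
    """Return (tag, body) for the first dollar-quoted span, or None if absent."""
    match = DOLLAR_QUOTED.search(statement)
    if match is None:
        return None
    return match.group(1), match.group(2)


statement = '''CREATE FUNCTION ...
BODY
$function$
    return "END BODY";
$function$
END BODY;'''

# The END BODY inside the Java string literal stays part of the body,
# because only the matching $function$ tag terminates it.
tag, body = extract_dollar_quoted(statement)
print(tag)
print(body.strip())
```

Because the closing tag has to equal the opening one, a body that itself needs the text {{$function$}} could simply pick a different tag.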

Additionally, we could support {{'}} escaping anyway.
{noformat}
CREATE FUNCTION ...
BODY 
    'return "END BODY";'
END BODY;
{noformat}
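
For the single-quoted variant, the escaping rule mentioned above ({{'}} becomes {{''}}) can be sketched in a few lines of Python. The helper names {{quote_body}} and {{unquote_body}} are hypothetical and only illustrate the rule; they are not part of cqlsh or C*.

```python
def quote_body(body):
    """Wrap a function body in single quotes, doubling any embedded quote."""
    return "'" + body.replace("'", "''") + "'"


def unquote_body(literal):
    """Reverse quote_body: strip the outer quotes and undouble inner ones."""
    if len(literal) < 2 or literal[0] != "'" or literal[-1] != "'":
        raise ValueError("not a single-quoted literal")
    return literal[1:-1].replace("''", "'")


# A Java body with only double quotes needs no changes inside the literal.
java_literal = quote_body('return "END BODY";')

# A script body that uses single quotes has every one of them doubled,
# which is the "looks really strange" case for other languages.
script_literal = quote_body("return 'END' + ' BODY';")
print(java_literal)
print(script_literal)
```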

I'd personally prefer to implement both ({{$function$...$function$}} and string).

With that in place, we could also remove {{K_END K_BODY}} completely and replace {{K_BODY}} with
{{K_AS}}; I have no preference on that.

So this would incorporate three tasks:
* change parsing in cqlsh for {{$function$}}
* change parsing in cqlsh to support multi-line strings
* change parsing in C*



was (Author: snazy):
note: in CASSANDRA-7562 I've changed parsing a bit and the end marker is {{K_END K_BODY}}

But the main point (anything looking like {{END BODY}} is recognized as the end of function
body) stays. IMO it was the "lesser of two evils" (compared to single-quote-escaping).

I thought about escaping - e.g. {{'}} to {{''}} - but didn't like it. It might work for Java
- but, as you said, other script languages would look really strange.

Something like
{noformat}
CREATE FUNCTION ...
BODY
$function$
    return "END BODY";
$function$
END BODY;
{noformat}
could work - would assume, that {{$function$}} is not used within the body.

Additionally we could also support that {{'}} escaping anyway.
{noformat}
CREATE FUNCTION ...
BODY 
    'return "END BODY";'
END BODY;
{noformat}

I'd personally prefer to implement both ({{$function$...$function$}} and string).

Having that, we could also remove the {{K_END K_BODY}} completely and replace {{K_BODY}} with
{{K_AS}} - have no preference for that.

So this would incorporate two tasks:
* change parsing in cqlsh for {{$function$}}
* change parsing in cqlsh to support multi-line strings
* change parsing in C*


> Parsing of UDF body is broken
> -----------------------------
>
>                 Key: CASSANDRA-7740
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7740
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sylvain Lebresne
>            Assignee: Robert Stupp
>
> The parsing of the function body introduced by CASSANDRA-7395 is somewhat broken. It blindly
> parses everything up to {{END_BODY}}, which has 2 problems:
> # it parses the function body as if it were part of the CQL syntax, so anything that doesn't
> happen to be a valid CQL token won't even parse.
> # something like
> {noformat}
> CREATE FUNCTION foo() RETURNS text LANGUAGE JAVA BODY return "END_BODY"; END_BODY;
> {noformat}
> will not parse correctly.
> I don't think we can accept random syntax like that. A better solution (which is the
> one PostgreSQL uses) is to pass the function body as a normal string. And in fact I'd be in
> favor of reusing PostgreSQL syntax (because why not), that is to have:
> {noformat}
> CREATE FUNCTION foo() RETURNS text LANGUAGE JAVA AS 'return "END_BODY"';
> {noformat}
> One minor annoyance might be, for certain languages, the necessity to double every quote
> inside the string. But in a separate ticket we could introduce PostgreSQL's solution of adding
> an [alternate syntax for string constants|http://www.postgresql.org/docs/9.1/static/sql-syntax-lexical.html#SQL-SYNTAX-DOLLAR-QUOTING].



--
This message was sent by Atlassian JIRA
(v6.2#6252)
