Date: Sun, 30 Oct 2016 10:46:56 -0700
From: "Yingyi Bu (Code Review)"
Reply-To: buyingyi@gmail.com
Subject: Change in asterixdb[master]: Address Don's comments in the expression doc.
X-Gerrit-MessageType: newchange
X-Gerrit-Change-Id: I224a706aa987a0d938ab22b9ae28660ef6433991
X-Gerrit-Commit: dd3bf275831d2dd6d9ceabc2177aca4c4988ca26

Yingyi Bu has uploaded a new change for review.

https://asterix-gerrit.ics.uci.edu/1327

Change subject: Address Don's comments in the expression doc.
......................................................................

Address Don's comments in the expression doc.

Change-Id: I224a706aa987a0d938ab22b9ae28660ef6433991
---
M asterixdb/asterix-doc/src/main/markdown/sqlpp/0_toc.md
M asterixdb/asterix-doc/src/main/markdown/sqlpp/2_expr.md
M asterixdb/asterix-lang-sqlpp/src/main/javacc/SQLPP.jj
3 files changed, 224 insertions(+), 162 deletions(-)

git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/27/1327/1

diff --git a/asterixdb/asterix-doc/src/main/markdown/sqlpp/0_toc.md b/asterixdb/asterix-doc/src/main/markdown/sqlpp/0_toc.md index b04ea6a..ff31357 100644 --- a/asterixdb/asterix-doc/src/main/markdown/sqlpp/0_toc.md +++ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/0_toc.md @@ -23,13 +23,6 @@ * [1. Introduction](#Introduction) * [2. 
Expressions](#Expressions) - * [Primary expressions](#Primary_expressions) - * [Literals](#Literals) - * [Variable references](#Variable_references) - * [Parenthesized expressions](#Parenthesized_expressions) - * [Function call expressions](#Function_call_expressions) - * [Constructors](#Constructors) - * [Path expressions](#Path_expressions) * [Operator expressions](#Operator_expressions) * [Arithmetic operators](#Arithmetic_operators) * [Collection operators](#Collection_operators) @@ -37,6 +30,13 @@ * [Logical operators](#Logical_operators) * [Case expressions](#Case_expressions) * [Quantified expressions](#Quantified_expressions) + * [Path expressions](#Path_expressions) + * [Primary expressions](#Primary_expressions) + * [Literals](#Literals) + * [Variable references](#Variable_references) + * [Parenthesized expressions](#Parenthesized_expressions) + * [Function call expressions](#Function_call_expressions) + * [Constructors](#Constructors) * [3. Queries](#Queries) * [SELECT statements](#SELECT_statements) * [SELECT clauses](#Select_clauses) diff --git a/asterixdb/asterix-doc/src/main/markdown/sqlpp/2_expr.md b/asterixdb/asterix-doc/src/main/markdown/sqlpp/2_expr.md index 17cf9bf..f3d4311 100644 --- a/asterixdb/asterix-doc/src/main/markdown/sqlpp/2_expr.md +++ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/2_expr.md @@ -21,152 +21,16 @@ Expression ::= OperatorExpression | CaseExpression | QuantifiedExpression -SQL++ is a highly composable expression language. Each SQL++ expression returns zero or more data model instances. There are three major kinds of expressions in SQL++. At the topmost level, a SQL++ expression can be an OperatorExpression (similar to a mathematical expression), an ConditionalExpression (to choose between alternative values), or a QuantifiedExpression (which yields a boolean value). Each will be detailed as we explore the full SQL++ grammar. +SQL++ is a highly composable expression language. 
Each SQL++ expression returns zero or more data model instances. +There are three major kinds of expressions in SQL++. At the topmost level, a SQL++ expression can be an +OperatorExpression (similar to a mathematical expression), a CaseExpression (to choose between +alternative values), or a QuantifiedExpression (which yields a boolean value). Each will be detailed as we +explore the full SQL++ grammar. -## Primary Expressions +Note that in the following text, words enclosed in angle brackets denote keywords that are not case-sensitive. - PrimaryExpr ::= Literal - | VariableReference - | ParenthesizedExpression - | FunctionCallExpression - | Constructor -The most basic building block for any SQL++ expression is PrimaryExpression. This can be a simple literal (constant) -value, a reference to a query variable that is in scope, a parenthesized expression, a function call, or a newly -constructed instance of the data model (such as a newly constructed object, array, or multiset of data model instances). - -### Literals - - Literal ::= StringLiteral - | IntegerLiteral - | FloatLiteral - | DoubleLiteral - | - | - | - | - StringLiteral ::= "\'" ( | ~["\'"])* "\'" - | "\"" ( | ~["\'"])* "\"" - ::= "\\\'" - ::= "\\\"" - IntegerLiteral ::= - ::= ["0" - "9"]+ - FloatLiteral ::= ( "f" | "F" ) - | ( "." ( "f" | "F" ) )? - | "." ( "f" | "F" ) - DoubleLiteral ::= - | ( "." )? - | "." - -Literals (constants) in SQL++ can be strings, integers, floating point values, double values, boolean constants, or special constant values like `NULL` and `MISSING`. The `NULL` value is like a `NULL` in SQL; it is used to represent an unknown field value. The specialy value `MISSING` is only meaningful in the context of SQL++ field accesses; it occurs when the accessed field simply does not exist at all in a object being accessed. - -The following are some simple examples of SQL++ literals. 
- -##### Examples - - 'a string' - "test string" - 42 - -Different from standard SQL, double quotes play the same role as single quotes and may be used for string literals in SQL++. - -### Variable References - - VariableReference ::= | - ::= ( | | "_" | "$")* - ::= ["A" - "Z", "a" - "z"] - DelimitedIdentifier ::= "\`" ( | ~["\'"])* "\`" - -A variable in SQL++ can be bound to any legal data model value. A variable reference refers to the value to which an in-scope variable is bound. (E.g., a variable binding may originate from one of the `FROM`, `WITH` or `LET` clauses of a `SELECT` statement or from an input parameter in the context of a function body.) Backticks, e.g., \`id\`, are used for delimited identifiers. Delimiting is needed when a variable's desired name clashes with a SQL++ keyword or includes characters not allowed in regular identifiers. - -##### Examples - - tweet - id - `SELECT` - `my-function` - -### Parenthesized expressions - - ParenthesizedExpression ::= "(" Expression ")" | Subquery - -An expression can be parenthesized to control the precedence order or otherwise clarify a query. In SQL++, for composability, a subquery is also an parenthesized expression. - -The following expression evaluates to the value 2. - -##### Example - - ( 1 + 1 ) - -### Function call expressions - - FunctionCallExpression ::= FunctionName "(" ( Expression ( "," Expression )* )? ")" - -Functions are included in SQL++, like most languages, as a way to package useful functionality or to componentize complicated or reusable SQL++ computations. A function call is a legal SQL++ query expression that represents the value resulting from the evaluation of its body expression with the given parameter bindings; the parameter value bindings can themselves be any SQL++ expressions. - -The following example is a (built-in) function call expression whose value is 8. 
- -##### Example - - length('a string') - -### Constructors - - CollectionConstructor ::= ArrayConstructor | MultisetConstructor - ArrayConstructor ::= "[" ( Expression ( "," Expression )* )? "]" - MultisetConstructor ::= "{{" ( Expression ( "," Expression )* )? "}}" - ObjectConstructor ::= "{" ( FieldBinding ( "," FieldBinding )* )? "}" - FieldBinding ::= Expression ":" Expression - -A major feature of SQL++ is its ability to construct new data model instances. This is accomplished using its constructors -for each of the model's complex object structures, namely arrays, multisets, and objects. -Arrays are like JSON arrays, while multisets have bag semantics. -Objects are built from fields that are field-name/field-value pairs, again like JSON. -(See the [data model document](../datamodel.html) for more details on each.) - -The following examples illustrate how to construct a new array with 3 items, a new object with 2 fields, -and a new multiset with 4 items, respectively. Array elements or multiset elements can be homogeneous (as in -the first example), -which is the common case, or they may be heterogeneous (as in the third example). The data values and field name values -used to construct arrays, multisets, and objects in constructors are all simply SQL++ expressions. Thus, the collection elements, -field names, and field values used in constructors can be simple literals or they can come from query variable references -or even arbitrarily complex SQL++ expressions (subqueries). - -##### Examples - - [ 'a', 'b', 'c' ] - - { - 'project name': 'Hyracks', - 'project members': [ 'vinayakb', 'dtabass', 'chenli', 'tsotras', 'tillw' ] - } - - {{ 42, "forty-two!", { "rank": "Captain", "name": "America" }, 3.14159 }} - -### Path expressions - - PathExpression ::= PrimaryExpression ( Field | Index )* - Field ::= "." Identifier - Index ::= "[" ( Expression | "?" ) "]" - -Components of complex types in the data model are accessed via path expressions. 
Path access can be applied to the result -of a SQL++ expression that yields an instance of a complex type, e.g., a object or array instance. For objects, -path access is based on field names. For arrays, path access is based on (zero-based) array-style indexing. -SQL++ also supports an "I'm feeling lucky" style index accessor, [?], for selecting an arbitrary element from an array. - Attempts to access non-existent fields or out-of-bound array elements produce the special value `MISSING`. - -The following examples illustrate field access for a object, index-based element access for an array, and also a -composition thereof. - -##### Examples - - ({"name": "MyABCs", "array": [ "a", "b", "c"]}).array - - (["a", "b", "c"])[2] - - ({"name": "MyABCs", "array": [ "a", "b", "c"]}).array[2] - -### Operator expressions +## Operator expressions Operators perform a specific operation on the input values or expressions. The syntax of an operator expression is as follows: @@ -188,7 +52,7 @@ |-----------------------------------------------------------------------------|-----------| | EXISTS, NOT EXISTS | collection emptiness testing | | ^ | exponentiation | -| *, / | multiplication, division | +| *, /, % | multiplication, division, modulo | | +, - | addition, subtraction | | || | string concatenation | | IS NULL, IS NOT NULL, IS MISSING, IS NOT MISSING,
IS UNKNOWN, IS NOT UNKNOWN| unknown value comparison | @@ -197,6 +61,11 @@ | NOT | logical negation | | AND | conjunction | | OR | disjunction | + +In general, if any operand evaluates to a `MISSING` value, the enclosing operator will return `MISSING`; +if no operand evaluates to a `MISSING` value but some operand evaluates to a `NULL` value, +the enclosing operator will return `NULL`. However, there are a few exceptions listed in +[comparison operators](#Comparison_operators) and [logical operators](#Logical_operators). ### Arithmetic operators Arithmetic operators are used to exponentiate, add, subtract, multiply, and divide numeric values, or concatenate string values. @@ -293,7 +162,7 @@ | NULL | NULL | | MISSING | MISSING | -### Case expressions +## Case expressions CaseExpression ::= SimpleCaseExpression | SearchedCaseExpression SimpleCaseExpression ::= Expression ( Expression Expression )+ ( Expression )? @@ -308,16 +177,23 @@ CASE (2 < 3) WHEN true THEN "yes" ELSE "no" END -### Quantified expressions +## Quantified expressions QuantifiedExpression ::= ( (|) | ) Variable Expression ( "," Variable "in" Expression )* Expression ()? -Quantified expressions are used for expressing existential or universal predicates involving the elements of a collection. +Quantified expressions are used for expressing existential or universal predicates involving the elements of a +collection. -The following pair of examples illustrate the use of a quantified expression to test that every (or some) element in the set [1, 2, 3] of integers is less than three. The first example yields `FALSE` and second example yields `TRUE`. +The following pair of examples illustrates the use of a quantified expression to test that every (or some) element in the +set [1, 2, 3] of integers is less than three. The first example yields `FALSE` and the second example yields `TRUE`. 
-It is useful to note that if the set were instead the empty set, the first expression would yield `TRUE` ("every" value in an empty set satisfies the condition) while the second expression would yield `FALSE` (since there isn't "some" value, as there are no values in the set, that satisfies the condition). +It is useful to note that if the set were instead the empty set, the first expression would yield `TRUE` ("every" value in an +empty set satisfies the condition) while the second expression would yield `FALSE` (since there isn't "some" value, as there are +no values in the set, that satisfies the condition). + +A quantified expression will return `NULL` (or `MISSING`) if the first expression in it evaluates to `NULL` (or `MISSING`). +A type error will be raised if the first expression in a quantified expression does not return a collection. ##### Examples @@ -325,3 +201,190 @@ SOME x IN [ 1, 2, 3 ] SATISFIES x < 3 +## Path expressions + + PathExpression ::= PrimaryExpression ( Field | Index )* + Field ::= "." Identifier + Index ::= "[" ( Expression | "?" ) "]" + +Components of complex types in the data model are accessed via path expressions. Path access can be applied to the result +of a SQL++ expression that yields an instance of a complex type, e.g., an object or array instance. For objects, +path access is based on field names. For arrays, path access is based on (zero-based) array-style indexing. +SQL++ also supports an "I'm feeling lucky" style index accessor, [?], for selecting an arbitrary element from an array. +Attempts to access non-existent fields or out-of-bounds array elements produce the special value `MISSING`. +Type errors will be raised for inappropriate use of a path expression, such as applying a field +accessor to a numeric value. + +The following examples illustrate field access for an object, index-based element access for an array, and also a +composition thereof. 
+ +##### Examples + + ({"name": "MyABCs", "array": [ "a", "b", "c"]}).array + + (["a", "b", "c"])[2] + + ({"name": "MyABCs", "array": [ "a", "b", "c"]}).array[2] + + +## Primary Expressions + + PrimaryExpr ::= Literal + | VariableReference + | ParenthesizedExpression + | FunctionCallExpression + | Constructor + +The most basic building block for any SQL++ expression is PrimaryExpression. This can be a simple literal (constant) +value, a reference to a query variable that is in scope, a parenthesized expression, a function call, or a newly +constructed instance of the data model (such as a newly constructed object, array, or multiset of data model instances). + +### Literals + + Literal ::= StringLiteral + | IntegerLiteral + | FloatLiteral + | DoubleLiteral + | + | + | + | + StringLiteral ::= "\"" ( + + | + | + | + | + | + | + | + | ~["\"","\\"])* + "\"" + | "\'"( + + | + | + | + | + | + | + | + | ~["\'","\\"])* + "\'" + ::= "\\\'" + ::= "\\\"" + ::= "\\\\" + ::= "\\/" + ::= "\\b" + ::= "\\f" + ::= "\\n" + ::= "\\r" + ::= "\\t" + + IntegerLiteral ::= + ::= ["0" - "9"]+ + FloatLiteral ::= ( "f" | "F" ) + | ( "." ( "f" | "F" ) )? + | "." ( "f" | "F" ) + DoubleLiteral ::= + | ( "." )? + | "." + +Literals (constants) in SQL++ can be strings, integers, floating point values, double values, boolean constants, or special constant values like `NULL` and `MISSING`. The `NULL` value is like a `NULL` in SQL; it is used to represent an unknown field value. The special value `MISSING` is only meaningful in the context of SQL++ field accesses; it occurs when the accessed field simply does not exist at all in an object being accessed. + +The following are some simple examples of SQL++ literals. + +##### Examples + + 'a string' + "test string" + 42 + +Unlike standard SQL, double quotes play the same role as single quotes and may be used for string literals in SQL++. 
+ +### Variable References + + VariableReference ::= | + ::= ( | | "_" | "$")* + ::= ["A" - "Z", "a" - "z"] + DelimitedIdentifier ::= "`" ( + | + | + | + | + | + | + | + | ~["`","\\"])* + "`" + +A variable in SQL++ can be bound to any legal data model value. A variable reference refers to the value to which an in-scope variable is +bound. (E.g., a variable binding may originate from one of the `FROM`, `WITH` or `LET` clauses of a `SELECT` statement or from an +input parameter in the context of a function body.) Backticks, e.g., \`id\`, are used for delimited identifiers. Delimiting is needed when +a variable's desired name clashes with a SQL++ keyword or includes characters not allowed in regular identifiers. + +##### Examples + + tweet + id + `SELECT` + `my-function` + +### Parenthesized expressions + + ParenthesizedExpression ::= "(" Expression ")" | Subquery + +An expression can be parenthesized to control the precedence order or otherwise clarify a query. In SQL++, for composability, a subquery is also a parenthesized expression. + +The following expression evaluates to the value 2. + +##### Example + + ( 1 + 1 ) + +### Function call expressions + + FunctionCallExpression ::= FunctionName "(" ( Expression ( "," Expression )* )? ")" + +Functions are included in SQL++, like most languages, as a way to package useful functionality or to componentize complicated or reusable SQL++ computations. A function call is a legal SQL++ query expression that represents the value resulting from the evaluation of its body expression with the given parameter bindings; the parameter value bindings can themselves be any SQL++ expressions. + +The following example is a (built-in) function call expression whose value is 8. + +##### Example + + length('a string') + + +### Constructors + + Constructor ::= ArrayConstructor | MultisetConstructor | ObjectConstructor + ArrayConstructor ::= "[" ( Expression ( "," Expression )* )? 
"]" + MultisetConstructor ::= "{{" ( Expression ( "," Expression )* )? "}}" + ObjectConstructor ::= "{" ( FieldBinding ( "," FieldBinding )* )? "}" + FieldBinding ::= Expression ":" Expression + +A major feature of SQL++ is its ability to construct new data model instances. This is accomplished using +its constructors for each of the model's complex object structures, namely arrays, multisets, and objects. +Arrays are like JSON arrays, while multisets have bag semantics. +Objects are built from fields that are field-name/field-value pairs, again like JSON. + +The following examples illustrate how to construct a new array with 4 items, a new object with 2 fields, +and a new multiset with 5 items, respectively. Array elements or multiset elements can be homogeneous (as in +the first example), +which is the common case, or they may be heterogeneous (as in the third example). The data values and field name values +used to construct arrays, multisets, and objects in constructors are all simply SQL++ expressions. Thus, the collection +elements, field names, and field values used in constructors can be simple literals or they can come from query variable +references or even arbitrarily complex SQL++ expressions (subqueries). +Type errors will be raised if the field names in an object are not strings, and +duplicate field errors will be raised if they are not distinct. + +##### Examples + + [ 'a', 'b', 'c', 'c' ] + + { + 'project name': 'Hyracks', + 'project members': [ 'vinayakb', 'dtabass', 'chenli', 'tsotras', 'tillw' ] + } + + {{ 42, "forty-two!", { "rank": "Captain", "name": "America" }, 3.14159, 42 }} diff --git a/asterixdb/asterix-lang-sqlpp/src/main/javacc/SQLPP.jj b/asterixdb/asterix-lang-sqlpp/src/main/javacc/SQLPP.jj index 50682c9..a53ba99 100644 --- a/asterixdb/asterix-lang-sqlpp/src/main/javacc/SQLPP.jj +++ b/asterixdb/asterix-lang-sqlpp/src/main/javacc/SQLPP.jj @@ -3236,13 +3236,12 @@ TOKEN: { - < DOUBLE_LITERAL: - | ( "." )? - | "." + < DOUBLE_LITERAL: ( "." 
) + | "." > - | < FLOAT_LITERAL: ( "f" | "F" ) - | ( "." ( "f" | "F" ) )? - | "." ( "f" | "F" ) + | < FLOAT_LITERAL: ( "f" | "F" ) + | ( "." ( "f" | "F" ) )? + | "." ( "f" | "F" ) > | )+ > } -- To view, visit https://asterix-gerrit.ics.uci.edu/1327 Gerrit-MessageType: newchange Gerrit-Change-Id: I224a706aa987a0d938ab22b9ae28660ef6433991 Gerrit-PatchSet: 1 Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Owner: Yingyi Bu