Yingyi Bu has submitted this change and it was merged.
Change subject: Address Don's comments in the expression doc.
......................................................................
Address Don's comments in the expression doc.
ChangeId: I224a706aa987a0d938ab22b9ae28660ef6433991
Reviewedon: https://asterixgerrit.ics.uci.edu/1327
SonarQube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
Testedby: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
IntegrationTests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
Reviewedby: Yingyi Bu <buyingyi@gmail.com>

M asterixdb/asterixdoc/src/main/markdown/sqlpp/0_toc.md
M asterixdb/asterixdoc/src/main/markdown/sqlpp/2_expr.md
M asterixdb/asterixlangsqlpp/src/main/javacc/SQLPP.jj
3 files changed, 224 insertions(+), 162 deletions()
Approvals:
Yingyi Bu: Looks good to me, approved
Jenkins: Verified; No violations found; Verified
diff git a/asterixdb/asterixdoc/src/main/markdown/sqlpp/0_toc.md b/asterixdb/asterixdoc/src/main/markdown/sqlpp/0_toc.md
index b04ea6a..ff31357 100644
 a/asterixdb/asterixdoc/src/main/markdown/sqlpp/0_toc.md
+++ b/asterixdb/asterixdoc/src/main/markdown/sqlpp/0_toc.md
@@ 23,13 +23,6 @@
* [1. Introduction](#Introduction)
* [2. Expressions](#Expressions)
 * [Primary expressions](#Primary_expressions)
 * [Literals](#Literals)
 * [Variable references](#Variable_references)
 * [Parenthesized expressions](#Parenthesized_expressions)
 * [Function call expressions](#Function_call_expressions)
 * [Constructors](#Constructors)
 * [Path expressions](#Path_expressions)
* [Operator expressions](#Operator_expressions)
* [Arithmetic operators](#Arithmetic_operators)
* [Collection operators](#Collection_operators)
@@ 37,6 +30,13 @@
* [Logical operators](#Logical_operators)
* [Case expressions](#Case_expressions)
* [Quantified expressions](#Quantified_expressions)
+ * [Path expressions](#Path_expressions)
+ * [Primary expressions](#Primary_expressions)
+ * [Literals](#Literals)
+ * [Variable references](#Variable_references)
+ * [Parenthesized expressions](#Parenthesized_expressions)
+ * [Function call expressions](#Function_call_expressions)
+ * [Constructors](#Constructors)
* [3. Queries](#Queries)
* [SELECT statements](#SELECT_statements)
* [SELECT clauses](#Select_clauses)
diff git a/asterixdb/asterixdoc/src/main/markdown/sqlpp/2_expr.md b/asterixdb/asterixdoc/src/main/markdown/sqlpp/2_expr.md
index 17cf9bf..f3d4311 100644
 a/asterixdb/asterixdoc/src/main/markdown/sqlpp/2_expr.md
+++ b/asterixdb/asterixdoc/src/main/markdown/sqlpp/2_expr.md
@@ 21,152 +21,16 @@
Expression ::= OperatorExpression  CaseExpression  QuantifiedExpression
SQL++ is a highly composable expression language. Each SQL++ expression returns zero or more
data model instances. There are three major kinds of expressions in SQL++. At the topmost
level, a SQL++ expression can be an OperatorExpression (similar to a mathematical expression),
an ConditionalExpression (to choose between alternative values), or a QuantifiedExpression
(which yields a boolean value). Each will be detailed as we explore the full SQL++ grammar.
+SQL++ is a highly composable expression language. Each SQL++ expression returns zero or more
data model instances.
+There are three major kinds of expressions in SQL++. At the topmost level, a SQL++ expression
can be an
+OperatorExpression (similar to a mathematical expression), an ConditionalExpression (to choose
between
+alternative values), or a QuantifiedExpression (which yields a boolean value). Each will
be detailed as we
+explore the full SQL++ grammar.
## <a id="Primary_expressions">Primary Expressions</a>
+Note that in the following text, words enclosed in angle brackets denote keywords that are
not casesensitive.
 PrimaryExpr ::= Literal
  VariableReference
  ParenthesizedExpression
  FunctionCallExpression
  Constructor
The most basic building block for any SQL++ expression is PrimaryExpression. This can be
a simple literal (constant)
value, a reference to a query variable that is in scope, a parenthesized expression, a function
call, or a newly
constructed instance of the data model (such as a newly constructed object, array, or multiset
of data model instances).

### <a id="Literals">Literals</a>

 Literal ::= StringLiteral
  IntegerLiteral
  FloatLiteral
  DoubleLiteral
  <NULL>
  <MISSING>
  <TRUE>
  <FALSE>
 StringLiteral ::= "\'" (<ESCAPE_APOS>  ~["\'"])* "\'"
  "\"" (<ESCAPE_QUOT>  ~["\'"])* "\""
 <ESCAPE_APOS> ::= "\\\'"
 <ESCAPE_QUOT> ::= "\\\""
 IntegerLiteral ::= <DIGITS>
 <DIGITS> ::= ["0"  "9"]+
 FloatLiteral ::= <DIGITS> ( "f"  "F" )
  <DIGITS> ( "." <DIGITS> ( "f"  "F" ) )?
  "." <DIGITS> ( "f"  "F" )
 DoubleLiteral ::= <DIGITS>
  <DIGITS> ( "." <DIGITS> )?
  "." <DIGITS>

Literals (constants) in SQL++ can be strings, integers, floating point values, double values,
boolean constants, or special constant values like `NULL` and `MISSING`. The `NULL` value
is like a `NULL` in SQL; it is used to represent an unknown field value. The specialy value
`MISSING` is only meaningful in the context of SQL++ field accesses; it occurs when the accessed
field simply does not exist at all in a object being accessed.

The following are some simple examples of SQL++ literals.

##### Examples

 'a string'
 "test string"
 42

Different from standard SQL, double quotes play the same role as single quotes and may be
used for string literals in SQL++.

### <a id="Variable_references">Variable References</a>

 VariableReference ::= <IDENTIFIER><DelimitedIdentifier>
 <IDENTIFIER> ::= <LETTER> (<LETTER>  <DIGIT>  "_"
 "$")*
 <LETTER> ::= ["A"  "Z", "a"  "z"]
 DelimitedIdentifier ::= "\`" (<ESCAPE_APOS>  ~["\'"])* "\`"

A variable in SQL++ can be bound to any legal data model value. A variable reference refers
to the value to which an inscope variable is bound. (E.g., a variable binding may originate
from one of the `FROM`, `WITH` or `LET` clauses of a `SELECT` statement or from an input parameter
in the context of a function body.) Backticks, e.g., \`id\`, are used for delimited identifiers.
Delimiting is needed when a variable's desired name clashes with a SQL++ keyword or includes
characters not allowed in regular identifiers.

##### Examples

 tweet
 id
 `SELECT`
 `myfunction`

### <a id="Parenthesized_expressions">Parenthesized expressions</a>

 ParenthesizedExpression ::= "(" Expression ")"  Subquery

An expression can be parenthesized to control the precedence order or otherwise clarify a
query. In SQL++, for composability, a subquery is also an parenthesized expression.

The following expression evaluates to the value 2.

##### Example

 ( 1 + 1 )

### <a id="Function_call_expressions">Function call expressions</a>

 FunctionCallExpression ::= FunctionName "(" ( Expression ( "," Expression )* )? ")"

Functions are included in SQL++, like most languages, as a way to package useful functionality
or to componentize complicated or reusable SQL++ computations. A function call is a legal
SQL++ query expression that represents the value resulting from the evaluation of its body
expression with the given parameter bindings; the parameter value bindings can themselves
be any SQL++ expressions.

The following example is a (builtin) function call expression whose value is 8.

##### Example

 length('a string')

### <a id="Constructors">Constructors</a>

 CollectionConstructor ::= ArrayConstructor  MultisetConstructor
 ArrayConstructor ::= "[" ( Expression ( "," Expression )* )? "]"
 MultisetConstructor ::= "{{" ( Expression ( "," Expression )* )? "}}"
 ObjectConstructor ::= "{" ( FieldBinding ( "," FieldBinding )* )? "}"
 FieldBinding ::= Expression ":" Expression

A major feature of SQL++ is its ability to construct new data model instances. This is accomplished
using its constructors
for each of the model's complex object structures, namely arrays, multisets, and objects.
Arrays are like JSON arrays, while multisets have bag semantics.
Objects are built from fields that are fieldname/fieldvalue pairs, again like JSON.
(See the [data model document](../datamodel.html) for more details on each.)

The following examples illustrate how to construct a new array with 3 items, a new object
with 2 fields,
and a new multiset with 4 items, respectively. Array elements or multiset elements can be
homogeneous (as in
the first example),
which is the common case, or they may be heterogeneous (as in the third example). The data
values and field name values
used to construct arrays, multisets, and objects in constructors are all simply SQL++ expressions.
Thus, the collection elements,
field names, and field values used in constructors can be simple literals or they can come
from query variable references
or even arbitrarily complex SQL++ expressions (subqueries).

##### Examples

 [ 'a', 'b', 'c' ]

 {
 'project name': 'Hyracks',
 'project members': [ 'vinayakb', 'dtabass', 'chenli', 'tsotras', 'tillw' ]
 }

 {{ 42, "fortytwo!", { "rank": "Captain", "name": "America" }, 3.14159 }}

### <a id="Path_expressions">Path expressions</a>

 PathExpression ::= PrimaryExpression ( Field  Index )*
 Field ::= "." Identifier
 Index ::= "[" ( Expression  "?" ) "]"

Components of complex types in the data model are accessed via path expressions. Path access
can be applied to the result
of a SQL++ expression that yields an instance of a complex type, e.g., a object or array
instance. For objects,
path access is based on field names. For arrays, path access is based on (zerobased) arraystyle
indexing.
SQL++ also supports an "I'm feeling lucky" style index accessor, [?], for selecting an arbitrary
element from an array.
 Attempts to access nonexistent fields or outofbound array elements produce the special
value `MISSING`.

The following examples illustrate field access for a object, indexbased element access for
an array, and also a
composition thereof.

##### Examples

 ({"name": "MyABCs", "array": [ "a", "b", "c"]}).array

 (["a", "b", "c"])[2]

 ({"name": "MyABCs", "array": [ "a", "b", "c"]}).array[2]

### <a id="Operator_expressions">Operator expressions</a>
+## <a id="Operator_expressions">Operator expressions</a>
Operators perform a specific operation on the input values or expressions. The syntax of
an operator expression is as follows:
@@ 188,7 +52,7 @@

 EXISTS, NOT EXISTS  collection
emptiness testing 
 ^  exponentiation

 *, /  multiplication,
division 
+ *, /, %  multiplication,
division, modulo 
 +,   addition,
subtraction 
 
 string concatenation 
 IS NULL, IS NOT NULL, IS MISSING, IS NOT MISSING, <br/>IS UNKNOWN, IS NOT UNKNOWN
unknown value comparison 
@@ 197,6 +61,11 @@
 NOT  logical negation

 AND  conjunction

 OR  disjunction

+
+In general, if any operand evaluates to a `MISSING` value, the enclosing operator will return
`MISSING`;
+if none of operands evaluates to a `MISSING` value but there is an operand evaluates to a
`NULL` value,
+the encolosing operator will return `NULL`. However, there are a few exceptions listed in
+[comparison operators](#Comparison_operators) and [logical operators](#Logical_operators).
### <a id="Arithmetic_operators">Arithmetic operators</a>
Arithemtic operators are used to exponentiate, add, subtract, multiply, and divide numeric
values, or concatenate string values.
@@ 293,7 +162,7 @@
 NULL  NULL 
 MISSING  MISSING 
### <a id="Case_expressions">Case expressions</a>
+## <a id="Case_expressions">Case expressions</a>
CaseExpression ::= SimpleCaseExpression  SearchedCaseExpression
SimpleCaseExpression ::= <CASE> Expression ( <WHEN> Expression <THEN>
Expression )+ ( <ELSE> Expression )? <END>
@@ 308,16 +177,23 @@
CASE (2 < 3) WHEN true THEN "yes" ELSE "no" END
### <a id="Quantified_expressions">Quantified expressions</a>
+## <a id="Quantified_expressions">Quantified expressions</a>
QuantifiedExpression ::= ( (<ANY><SOME>)  <EVERY> ) Variable <IN>
Expression ( "," Variable "in" Expression )*
<SATISFIES> Expression (<END>)?
Quantified expressions are used for expressing existential or universal predicates involving
the elements of a collection.
+Quantified expressions are used for expressing existential or universal predicates involving
the elements of a
+collection.
The following pair of examples illustrate the use of a quantified expression to test that
every (or some) element in the set [1, 2, 3] of integers is less than three. The first example
yields `FALSE` and second example yields `TRUE`.
+The following pair of examples illustrate the use of a quantified expression to test that
every (or some) element in the
+set [1, 2, 3] of integers is less than three. The first example yields `FALSE` and second
example yields `TRUE`.
It is useful to note that if the set were instead the empty set, the first expression would
yield `TRUE` ("every" value in an empty set satisfies the condition) while the second expression
would yield `FALSE` (since there isn't "some" value, as there are no values in the set, that
satisfies the condition).
+It is useful to note that if the set were instead the empty set, the first expression would
yield `TRUE` ("every" value in an
+empty set satisfies the condition) while the second expression would yield `FALSE` (since
there isn't "some" value, as there are
+no values in the set, that satisfies the condition).
+
+A quantified expression will return a `NULL` (or `MISSING`) if the first expression in it
evaluates to `NULL` (or `MISSING`).
+A type error will be raised if the first expression in a quantified expression does not return
a collection.
##### Examples
@@ 325,3 +201,190 @@
SOME x IN [ 1, 2, 3 ] SATISFIES x < 3
+## <a id="Path_expressions">Path expressions</a>
+
+ PathExpression ::= PrimaryExpression ( Field  Index )*
+ Field ::= "." Identifier
+ Index ::= "[" ( Expression  "?" ) "]"
+
+Components of complex types in the data model are accessed via path expressions. Path access
can be applied to the result
+of a SQL++ expression that yields an instance of a complex type, e.g., a object or array
instance. For objects,
+path access is based on field names. For arrays, path access is based on (zerobased) arraystyle
indexing.
+SQL++ also supports an "I'm feeling lucky" style index accessor, [?], for selecting an arbitrary
element from an array.
+Attempts to access nonexistent fields or outofbound array elements produce the special
value `MISSING`.
+Type errors will be raised for inappropriate use of a path expression, such as applying a
field
+accessor to a numeric value.
+
+The following examples illustrate field access for a object, indexbased element access for
an array, and also a
+composition thereof.
+
+##### Examples
+
+ ({"name": "MyABCs", "array": [ "a", "b", "c"]}).array
+
+ (["a", "b", "c"])[2]
+
+ ({"name": "MyABCs", "array": [ "a", "b", "c"]}).array[2]
+
+
+## <a id="Primary_expressions">Primary Expressions</a>
+
+ PrimaryExpr ::= Literal
+  VariableReference
+  ParenthesizedExpression
+  FunctionCallExpression
+  Constructor
+
+The most basic building block for any SQL++ expression is PrimaryExpression. This can be
a simple literal (constant)
+value, a reference to a query variable that is in scope, a parenthesized expression, a function
call, or a newly
+constructed instance of the data model (such as a newly constructed object, array, or multiset
of data model instances).
+
+### <a id="Literals">Literals</a>
+
+ Literal ::= StringLiteral
+  IntegerLiteral
+  FloatLiteral
+  DoubleLiteral
+  <NULL>
+  <MISSING>
+  <TRUE>
+  <FALSE>
+ StringLiteral ::= "\"" (
+ <EscapeQuot>
+  <EscapeBslash>
+  <EscapeSlash>
+  <EscapeBspace>
+  <EscapeFormf>
+  <EscapeNl>
+  <EscapeCr>
+  <EscapeTab>
+  ~["\"","\\"])*
+ "\""
+  "\'"(
+ <EscapeApos>
+  <EscapeBslash>
+  <EscapeSlash>
+  <EscapeBspace>
+  <EscapeFormf>
+  <EscapeNl>
+  <EscapeCr>
+  <EscapeTab>
+  ~["\'","\\"])*
+ "\'"
+ <ESCAPE_Apos> ::= "\\\'"
+ <ESCAPE_Quot> ::= "\\\""
+ <EscapeBslash> ::= "\\\\"
+ <EscapeSlash> ::= "\\/"
+ <EscapeBspace> ::= "\\b"
+ <EscapeFormf> ::= "\\f"
+ <EscapeNl> ::= "\\n"
+ <EscapeCr> ::= "\\r"
+ <EscapeTab> ::= "\\t"
+
+ IntegerLiteral ::= <DIGITS>
+ <DIGITS> ::= ["0"  "9"]+
+ FloatLiteral ::= <DIGITS> ( "f"  "F" )
+  <DIGITS> ( "." <DIGITS> ( "f"  "F" ) )?
+  "." <DIGITS> ( "f"  "F" )
+ DoubleLiteral ::= <DIGITS>
+  <DIGITS> ( "." <DIGITS> )?
+  "." <DIGITS>
+
+Literals (constants) in SQL++ can be strings, integers, floating point values, double values,
boolean constants, or special constant values like `NULL` and `MISSING`. The `NULL` value
is like a `NULL` in SQL; it is used to represent an unknown field value. The specialy value
`MISSING` is only meaningful in the context of SQL++ field accesses; it occurs when the accessed
field simply does not exist at all in a object being accessed.
+
+The following are some simple examples of SQL++ literals.
+
+##### Examples
+
+ 'a string'
+ "test string"
+ 42
+
+Different from standard SQL, double quotes play the same role as single quotes and may be
used for string literals in SQL++.
+
+### <a id="Variable_references">Variable References</a>
+
+ VariableReference ::= <IDENTIFIER><DelimitedIdentifier>
+ <IDENTIFIER> ::= <LETTER> (<LETTER>  <DIGIT>  "_"
 "$")*
+ <LETTER> ::= ["A"  "Z", "a"  "z"]
+ DelimitedIdentifier ::= "`" (<EscapeQuot>
+  <EscapeBslash>
+  <EscapeSlash>
+  <EscapeBspace>
+  <EscapeFormf>
+  <EscapeNl>
+  <EscapeCr>
+  <EscapeTab>
+  ~["`","\\"])*
+ "`"
+
+A variable in SQL++ can be bound to any legal data model value. A variable reference refers
to the value to which an inscope variable is
+bound. (E.g., a variable binding may originate from one of the `FROM`, `WITH` or `LET` clauses
of a `SELECT` statement or from an
+input parameter in the context of a function body.) Backticks, e.g., \`id\`, are used for
delimited identifiers. Delimiting is needed when
+a variable's desired name clashes with a SQL++ keyword or includes characters not allowed
in regular identifiers.
+
+##### Examples
+
+ tweet
+ id
+ `SELECT`
+ `myfunction`
+
+### <a id="Parenthesized_expressions">Parenthesized expressions</a>
+
+ ParenthesizedExpression ::= "(" Expression ")"  Subquery
+
+An expression can be parenthesized to control the precedence order or otherwise clarify a
query. In SQL++, for composability, a subquery is also an parenthesized expression.
+
+The following expression evaluates to the value 2.
+
+##### Example
+
+ ( 1 + 1 )
+
+### <a id="Function_call_expressions">Function call expressions</a>
+
+ FunctionCallExpression ::= FunctionName "(" ( Expression ( "," Expression )* )? ")"
+
+Functions are included in SQL++, like most languages, as a way to package useful functionality
or to componentize complicated or reusable SQL++ computations. A function call is a legal
SQL++ query expression that represents the value resulting from the evaluation of its body
expression with the given parameter bindings; the parameter value bindings can themselves
be any SQL++ expressions.
+
+The following example is a (builtin) function call expression whose value is 8.
+
+##### Example
+
+ length('a string')
+
+
+### <a id="Constructors">Constructors</a>
+
+ Constructor ::= ArrayConstructor  MultisetConstructor  ObjectConstructor
+ ArrayConstructor ::= "[" ( Expression ( "," Expression )* )? "]"
+ MultisetConstructor ::= "{{" ( Expression ( "," Expression )* )? "}}"
+ ObjectConstructor ::= "{" ( FieldBinding ( "," FieldBinding )* )? "}"
+ FieldBinding ::= Expression ":" Expression
+
+A major feature of SQL++ is its ability to construct new data model instances. This is accomplished
using
+its constructors for each of the model's complex object structures, namely arrays, multisets,
and objects.
+Arrays are like JSON arrays, while multisets have bag semantics.
+Objects are built from fields that are fieldname/fieldvalue pairs, again like JSON.
+
+The following examples illustrate how to construct a new array with 4 items, a new object
with 2 fields,
+and a new multiset with 5 items, respectively. Array elements or multiset elements can be
homogeneous (as in
+the first example),
+which is the common case, or they may be heterogeneous (as in the third example). The data
values and field name values
+used to construct arrays, multisets, and objects in constructors are all simply SQL++ expressions.
Thus, the collection
+elements, field names, and field values used in constructors can be simple literals or they
can come from query variable
+references or even arbitrarily complex SQL++ expressions (subqueries).
+Type errors will be raised if the field names in a record must be strings, and
+duplicate field errors will be raised if they are not distinct.
+
+##### Examples
+
+ [ 'a', 'b', 'c', 'c' ]
+
+ {
+ 'project name': 'Hyracks',
+ 'project members': [ 'vinayakb', 'dtabass', 'chenli', 'tsotras', 'tillw' ]
+ }
+
+ {{ 42, "fortytwo!", { "rank": "Captain", "name": "America" }, 3.14159, 42 }}
diff git a/asterixdb/asterixlangsqlpp/src/main/javacc/SQLPP.jj b/asterixdb/asterixlangsqlpp/src/main/javacc/SQLPP.jj
index f9fa444..ac6083d 100644
 a/asterixdb/asterixlangsqlpp/src/main/javacc/SQLPP.jj
+++ b/asterixdb/asterixlangsqlpp/src/main/javacc/SQLPP.jj
@@ 3223,13 +3223,12 @@
<DEFAULT,IN_DBL_BRACE>
TOKEN:
{
 < DOUBLE_LITERAL: <DIGITS>
  <DIGITS> ( "." <DIGITS> )?
  "." <DIGITS>
+ < DOUBLE_LITERAL: <DIGITS> ( "." <DIGITS> )
+  "." <DIGITS>
>
  < FLOAT_LITERAL: <DIGITS> ( "f"  "F" )
  <DIGITS> ( "." <DIGITS> ( "f"  "F" ) )?
  "." <DIGITS> ( "f"  "F" )
+  < FLOAT_LITERAL: <DIGITS> ( "f"  "F" )
+  <DIGITS> ( "." <DIGITS> ( "f"  "F" ) )?
+  "." <DIGITS> ( "f"  "F" )
>
 <DIGITS : (<DIGIT>)+ >
}

To view, visit https://asterixgerrit.ics.uci.edu/1327
To unsubscribe, visit https://asterixgerrit.ics.uci.edu/settings
GerritMessageType: merged
GerritChangeId: I224a706aa987a0d938ab22b9ae28660ef6433991
GerritPatchSet: 3
GerritProject: asterixdb
GerritBranch: master
GerritOwner: Yingyi Bu <buyingyi@gmail.com>
GerritReviewer: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
GerritReviewer: Till Westmann <tillw@apache.org>
GerritReviewer: Yingyi Bu <buyingyi@gmail.com>
