asterixdb-commits mailing list archives

Subject asterixdb git commit: SQL++ doc/grammar cleanup
Date Sat, 01 Oct 2016 00:08:36 GMT
Repository: asterixdb
Updated Branches:
  refs/heads/master 45c3304e7 -> f60282766

SQL++ doc/grammar cleanup

- remove comments that are addressed
- adapt grammar according to feedback

Change-Id: I6b4f5c7ae48c022a6b8f8c48b3927e1981b70598
Sonar-Qube: Jenkins <>
Tested-by: Jenkins <>
Reviewed-by: Yingyi Bu <>
Integration-Tests: Jenkins <>


Branch: refs/heads/master
Commit: f602827663f8b1b2438488ba85ae7782cd9bf0a0
Parents: 45c3304
Author: Till Westmann <>
Authored: Fri Sep 30 14:34:42 2016 -0700
Committer: Till Westmann <>
Committed: Fri Sep 30 17:07:14 2016 -0700

 .../src/main/markdown/sqlpp/           |  7 ++--
 .../src/main/markdown/sqlpp/          |  4 ---
 .../src/main/markdown/sqlpp/            | 36 ++------------------
 .../asterix-lang-sqlpp/src/main/javacc/SQLPP.jj |  7 ++--
 4 files changed, 7 insertions(+), 47 deletions(-)
diff --git a/asterixdb/asterix-doc/src/main/markdown/sqlpp/ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/
index 0e834e6..79f9da0 100644
--- a/asterixdb/asterix-doc/src/main/markdown/sqlpp/
+++ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/
@@ -25,8 +25,9 @@ The most basic building block for any SQL++ expression is PrimaryExpression.
                        | <TRUE>
                        | <FALSE>
     StringLiteral  ::= "\'" (<ESCAPE_APOS> | ~["\'"])* "\'"
-                       | "\"" (<ESCAPE_APOS> | ~["\'"])* "\""
+                       | "\"" (<ESCAPE_QUOT> | ~["\'"])* "\""
     <ESCAPE_APOS>  ::= "\\\'"
+    <ESCAPE_QUOT>  ::= "\\\""
     IntegerLiteral ::= <DIGITS>
     <DIGITS>       ::= ["0" - "9"]+
     FloatLiteral   ::= <DIGITS> ( "f" | "F" )
@@ -36,10 +37,6 @@ The most basic building block for any SQL++ expression is PrimaryExpression.
                      | <DIGITS> ( "." <DIGITS> )?
                      | "." <DIGITS>
-> MC: I tentatively deleted the following unused ESCAPE_QUOTE definition: &lt;ESCAPE_QUOT&gt;
 ::= "\\\""
-> 		&lt;ESCAPE_QUOT&gt;  ::= "\\\""
-> Also, I moved the DelimitedIdentifier down further per TW's suggestion.
 Literals (constants) in SQL++ can be strings, integers, floating point values, double values,
boolean constants, or special constant values like `NULL` and `MISSING`. The `NULL` value
is like a `NULL` in SQL; it is used to represent an unknown field value. The specialy value
`MISSING` is only meaningful in the context of SQL++ field accesses; it occurs when the accessed
field simply does not exist at all in a record being accessed.
 The following are some simple examples of SQL++ literals.
diff --git a/asterixdb/asterix-doc/src/main/markdown/sqlpp/ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/
index 04d8e5b..66fd8f1 100644
--- a/asterixdb/asterix-doc/src/main/markdown/sqlpp/
+++ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/
@@ -8,8 +8,6 @@ A SQL++ query can be any legal SQL++ expression or `SELECT` statement. A SQL++
 The following shows the (rich) grammar for the `SELECT` statement in SQL++.
-> TW: Should we replace SelectElement with SelectValue? MC: Yes, and done below.
     SelectStatement    ::= ( WithClause )?
                            SelectSetOperation (OrderbyClause )? ( LimitClause )?
     SelectSetOperation ::= SelectBlock (<UNION> <ALL> ( SelectBlock | Subquery
) )*
@@ -642,8 +640,6 @@ will rewrite as follows:
     GROUP BY msg.authorId AS uid GROUP AS `$1`(msg AS msg);
 > TW: We really need to do something about `COLL_SQL-COUNT`.
-> MC: You mean about its name? And inconsistent dashing? I agree...!  :-)
-> Also, do we need to say anything about the (mandatory) double parens here?
 The same sort of rewritings apply to the function symbols `SUM`, `MAX`, `MIN`, and `AVG`.
 In contrast to the SQL++ collection aggregate functions, these special SQL-92 function symbols
diff --git a/asterixdb/asterix-doc/src/main/markdown/sqlpp/ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/
index 9dc6947..e83e47d 100644
--- a/asterixdb/asterix-doc/src/main/markdown/sqlpp/
+++ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/
@@ -15,9 +15,6 @@ In addition to queries, the AsterixDB implementation of SQL++ supports statement
 manipulation purposes as well as controlling the context to be used in evaluating SQL++ expressions.
 This section details the DDL and DML statements supported in the SQL++ language as realized
in Apache AsterixDB.
-> TW: AsterixDB?
-> MC: Good question here - I eradicated the preceding references except in the Intro,
which needs a rewrite, but here it is really still about AsterixDB, I think?  (Since most
of these statements will be hidden in the Couchbase case?)
 ## <a id="Declarations">Declarations</a>
     DatabaseDeclaration ::= "USE" Identifier
@@ -55,7 +52,6 @@ For our sample data set, this returns:
       { "id": 2, "name": "IsbelDull", "friendCount": 2 }
 ## <a id="Lifecycle_management_statements">Lifecycle management statements</a>
@@ -74,17 +70,12 @@ It can be used to create new dataverses, datatypes, datasets, indexes,
and user-
 ### <a id="Dataverses"> Dataverses</a>
-    DatabaseSpecification ::= "DATAVERSE" Identifier IfNotExists ( "WITH" "FORMAT" StringLiteral
+    DatabaseSpecification ::= "DATAVERSE" Identifier IfNotExists
 The CREATE DATAVERSE statement is used to create new dataverses.
 To ease the authoring of reusable SQL++ scripts, an optional IF NOT EXISTS clause is included
to allow
 creation to be requested either unconditionally or only if the dataverse does not already
 If this clause is absent, an error is returned if a dataverse with the indicated name already
-(Note: The `WITH FORMAT` clause in the syntax above is a placeholder for possible `future
-that can safely be ignored here.)
-> MC: Should we get rid of WITH FORMAT? (I think we should - here and in the system -
if we ever do it
-I would actually expect it to be more fine-grained than the dataverse level.)
 The following example creates a new dataverse named TinySocial if one does not already exist.
@@ -94,7 +85,7 @@ The following example creates a new dataverse named TinySocial if one does
not a
 ### <a id="Types"> Types</a>
-    TypeSpecification    ::= "TYPE" FunctionOrTypeName IfNotExists "AS" TypeExpr
+    TypeSpecification    ::= "TYPE" FunctionOrTypeName IfNotExists "AS" RecordTypeDef
     FunctionOrTypeName   ::= QualifiedName
     IfNotExists          ::= ( <IF> <NOT> <EXISTS> )?
     TypeExpr             ::= RecordTypeDef | TypeReference | OrderedListTypeDef | UnorderedListTypeDef
@@ -106,9 +97,6 @@ The following example creates a new dataverse named TinySocial if one does
not a
     OrderedListTypeDef   ::= "[" ( TypeExpr ) "]"
     UnorderedListTypeDef ::= "{{" ( TypeExpr ) "}}"
-> TW: How should we refer to the data model? "Asterix Data Model" seems system specific.
-> MC: Agreed that this is an issue. Let's first decide and I can handle the issue in a
later pass.
 The CREATE TYPE statement is used to create a new named ADM datatype.
 This type can then be used to create stored collections or utilized when defining one or
more other ADM datatypes.
 Much more information about the Asterix Data Model (ADM) is available in the [data model
reference guide](datamodel.html) to ADM.
@@ -117,8 +105,6 @@ A record type can be defined as being either open or closed.
 Instances of a closed record type are not permitted to contain fields other than those specified
in the create type statement.
 Instances of an open record type may carry additional fields, and open is the default for
new types if neither option is specified.
-> MC: I had forgotten about options other than using CREATE TYPE to introduce new record
types! (Are all of the other AS TypeExpr possibilities actually well-tested?)
 The following example creates a new ADM record type called GleambookUser type.
 Since it is defined as (defaulting to) being an open type,
 instances will be permitted to contain more than what is specified in the type definition.
@@ -171,11 +157,6 @@ This field type can be used if you want to have this field be an autogenerated-P
     PrimaryKey           ::= <PRIMARY> <KEY> NestedField ( "," NestedField )*
     CompactionPolicy     ::= Identifier
-> TW: Again, a lot of AsterixDB in the following paragraph.
-> Also, while I'm sure that this was always like this, the separation of `Configuration`
-> from `Properties` looks pretty confusing ...
-> MC: Not sure what we should do about all this, actually! (I don't disagree. New JSON
syntax coming, too?)
 The CREATE DATASET statement is used to create a new dataset.
 Datasets are named, unordered collections of ADM record type instances;
 they are where data lives persistently and are the usual targets for SQL++ queries.
@@ -188,9 +169,6 @@ Internal datasets contain several advanced options that can be specified
when ap
 One such option is that random primary key (UUID) values can be auto-generated by declaring
the field to be UUID and putting "AUTOGENERATED" after the "PRIMARY KEY" identifier.
 In this case, unlike other non-optional fields, a value for the auto-generated PK field should
not be provided at insertion time by the user since each record's primary key field value
will be auto-generated by the system.
-> TW: "The Filter-Based LSM Index Acceleration" seems to be quite system specific ...
-> MC: Indeed, but that is always inescapable in DDL reference manuals, no? (We have to
decide what to say where. :-))
 Another advanced option, when creating an Internal dataset, is to specify the merge policy
to control which of the
 underlying LSM storage components to be merged.
 (AsterixDB supports Log-Structured Merge tree based physical storage for Internal datasets.)
@@ -268,8 +246,6 @@ specified at the end of the index definition.
 `ENFORCING` an open field introduces a check that makes sure that the actual type of the
indexed field
 (if the optional field exists in the record) always matches this specified (open) field type.
-*Editor's note: The ? shown above after the type is intended to be mandatory, and we need
to make that happen.*
 The following example creates a btree index called gbAuthorIdx on the authorId field of the
GleambookMessages dataset.
 This index can be useful for accelerating exact-match queries, range search queries, and
joins involving the author-id
@@ -285,8 +261,6 @@ This index can be useful for accelerating exact-match queries, range search
     CREATE INDEX gbSendTimeIdx ON GleambookMessages(sendTime: datetime?) TYPE BTREE ENFORCED;
-> MC: The above works in my branch (with ? mandatory) but not in the main branch. We need
to change that. :-)
 The following example creates a btree index called crpUserScrNameIdx on screenName,
 a nested field residing within a record-valued user field in the ChirpMessages dataset.
 This index can be useful for accelerating exact-match queries, range search queries,
@@ -389,10 +363,6 @@ The following example shows how to bulk load the GleambookUsers dataset
from an
     InsertStatement ::= <INSERT> <INTO> QualifiedName Query
-> TW: AsterixDB-specifc transactions semantics ...
-> Also, do we also support `UPSERT`?
-> MC: Yes to both. :-) Whoops. Wait, maybe not. We do have upsert in AQL, but not in SQL++
today, it seems. I'll document it anyway...? :-)
 The SQL++ INSERT statement is used to insert new data into a dataset.
 The data to be inserted comes from a SQL++ query expression.
 This expression can be as simple as a constant expression, or in general it can be any legal
SQL++ query.
@@ -430,7 +400,7 @@ The following example illustrates a query-based upsert operation.
 ### <a id="Deletes">DELETEs</a>
-    DeleteStatement ::= <DELETE> <FROM> QualifiedName ( (<AS>)? Variable
)? ( <WHERE> Expression )?
+    DeleteStatement ::= <DELETE> <FROM> QualifiedName ( ( <AS> )? Variable
)? ( <WHERE> Expression )?
 The SQL++ DELETE statement is used to delete data from a target dataset.
 The data to be deleted is identified by a boolean expression involving the variable bound
to the target dataset in the DELETE statement.
diff --git a/asterixdb/asterix-lang-sqlpp/src/main/javacc/SQLPP.jj b/asterixdb/asterix-lang-sqlpp/src/main/javacc/SQLPP.jj
index f330f40..6c4bc5c 100644
--- a/asterixdb/asterix-lang-sqlpp/src/main/javacc/SQLPP.jj
+++ b/asterixdb/asterix-lang-sqlpp/src/main/javacc/SQLPP.jj
@@ -414,7 +414,7 @@ TypeDecl TypeSpecification(String hint, boolean dgen) throws ParseException:
   <TYPE> nameComponents = TypeName() ifNotExists = IfNotExists()
-  <AS> typeExpr = TypeExpr()
+  <AS> typeExpr = RecordTypeDef()
       long numValues = -1;
       String filename = null;
@@ -683,14 +683,12 @@ CreateDataverseStatement DataverseSpecification() throws ParseException
   String dvName = null;
   boolean ifNotExists = false;
-  String format = null;
   <DATAVERSE> dvName = Identifier()
   ifNotExists = IfNotExists()
-  ( LOOKAHEAD(1) <WITH> <FORMAT> format = ConstantString() )?
-      return new CreateDataverseStatement(new Identifier(dvName), format, ifNotExists);
+      return new CreateDataverseStatement(new Identifier(dvName), null, ifNotExists);
@@ -3086,7 +3084,6 @@ TOKEN [IGNORE_CASE]:
   | <FILTER : "filter">
   | <FLATTEN : "flatten">
   | <FOR : "for">
-  | <FORMAT : "format">
   | <FROM : "from">
   | <FULL : "full">
   | <FUNCTION : "function">
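
Illustration (not part of the commit): the StringLiteral production touched in the first hunk can be approximated with regular expressions. The sketch below is a rough reading of that grammar, not the parser's actual lexer. The names SINGLE, DOUBLE, and is_string_literal are hypothetical, and it assumes the double-quoted form is meant to exclude the double quote from its body, although the grammar text as committed still shows ~["\'"] there.

```python
import re

# Single-quoted form: "\'" (<ESCAPE_APOS> | ~["\'"])* "\'"
SINGLE = r"'(?:\\'|[^'])*'"

# Double-quoted form, using <ESCAPE_QUOT>.
# Assumption: the body excludes '"' rather than "'" as printed in the doc grammar.
DOUBLE = r'"(?:\\"|[^"])*"'

STRING_LITERAL = re.compile(rf"(?:{SINGLE}|{DOUBLE})")


def is_string_literal(text: str) -> bool:
    """Return True if the whole input matches one of the two quoted forms."""
    return STRING_LITERAL.fullmatch(text) is not None


if __name__ == "__main__":
    # Escaped delimiter inside each quoting style, then an unescaped one.
    for candidate in ["'it\\'s'", '"say \\"hi\\""', "'it's'"]:
        print(candidate, is_string_literal(candidate))
```

Each quoting style only needs an escape for its own delimiter, which is why the double-quoted alternative refers to ESCAPE_QUOT rather than ESCAPE_APOS.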
