asterixdb-notifications mailing list archives

From "Till Westmann (Code Review)" <>
Subject Change in asterixdb[master]: SQL++ doc/grammar cleanup
Date Fri, 30 Sep 2016 21:35:39 GMT
Till Westmann has uploaded a new change for review.

Change subject: SQL++ doc/grammar cleanup

SQL++ doc/grammar cleanup

- remove comments that are addressed
- adapt grammar according to feedback

Change-Id: I6b4f5c7ae48c022a6b8f8c48b3927e1981b70598
M asterixdb/asterix-doc/src/main/markdown/sqlpp/
M asterixdb/asterix-doc/src/main/markdown/sqlpp/
M asterixdb/asterix-doc/src/main/markdown/sqlpp/
M asterixdb/asterix-lang-sqlpp/src/main/javacc/SQLPP.jj
4 files changed, 7 insertions(+), 47 deletions(-)

  git pull ssh:// refs/changes/33/1233/1

diff --git a/asterixdb/asterix-doc/src/main/markdown/sqlpp/ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/
index 0e834e6..79f9da0 100644
--- a/asterixdb/asterix-doc/src/main/markdown/sqlpp/
+++ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/
@@ -25,8 +25,9 @@
                        | <TRUE>
                        | <FALSE>
     StringLiteral  ::= "\'" (<ESCAPE_APOS> | ~["\'"])* "\'"
-                       | "\"" (<ESCAPE_APOS> | ~["\'"])* "\""
+                       | "\"" (<ESCAPE_QUOT> | ~["\'"])* "\""
     <ESCAPE_APOS>  ::= "\\\'"
+    <ESCAPE_QUOT>  ::= "\\\""
     IntegerLiteral ::= <DIGITS>
     <DIGITS>       ::= ["0" - "9"]+
     FloatLiteral   ::= <DIGITS> ( "f" | "F" )
@@ -35,10 +36,6 @@
     DoubleLiteral  ::= <DIGITS>
                      | <DIGITS> ( "." <DIGITS> )?
                      | "." <DIGITS>
-> MC: I tentatively deleted the following unused ESCAPE_QUOTE definition: &lt;ESCAPE_QUOT&gt; ::= "\\\""
-> 		&lt;ESCAPE_QUOT&gt;  ::= "\\\""
-> Also, I moved the DelimitedIdentifier down further per TW's suggestion.
 Literals (constants) in SQL++ can be strings, integers, floating point values, double values, boolean constants, or special constant values like `NULL` and `MISSING`. The `NULL` value is like a `NULL` in SQL; it is used to represent an unknown field value. The specialy value `MISSING` is only meaningful in the context of SQL++ field accesses; it occurs when the accessed field simply does not exist at all in a record being accessed.
diff --git a/asterixdb/asterix-doc/src/main/markdown/sqlpp/ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/
index 04d8e5b..66fd8f1 100644
--- a/asterixdb/asterix-doc/src/main/markdown/sqlpp/
+++ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/
@@ -8,8 +8,6 @@
 The following shows the (rich) grammar for the `SELECT` statement in SQL++.
-> TW: Should we replace SelectElement with SelectValue? MC: Yes, and done below.
     SelectStatement    ::= ( WithClause )?
                            SelectSetOperation (OrderbyClause )? ( LimitClause )?
     SelectSetOperation ::= SelectBlock (<UNION> <ALL> ( SelectBlock | Subquery ) )*
@@ -642,8 +640,6 @@
     GROUP BY msg.authorId AS uid GROUP AS `$1`(msg AS msg);
 > TW: We really need to do something about `COLL_SQL-COUNT`.
-> MC: You mean about its name? And inconsistent dashing? I agree...!  :-)
-> Also, do we need to say anything about the (mandatory) double parens here?
 The same sort of rewritings apply to the function symbols `SUM`, `MAX`, `MIN`, and `AVG`.
 In contrast to the SQL++ collection aggregate functions, these special SQL-92 function symbols
diff --git a/asterixdb/asterix-doc/src/main/markdown/sqlpp/ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/
index 9dc6947..e83e47d 100644
--- a/asterixdb/asterix-doc/src/main/markdown/sqlpp/
+++ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/
@@ -15,9 +15,6 @@
 manipulation purposes as well as controlling the context to be used in evaluating SQL++ expressions.
 This section details the DDL and DML statements supported in the SQL++ language as realized in Apache AsterixDB.
-> TW: AsterixDB?
-> MC: Good question here - I eradicated the preceding references except in the Intro, which needs a rewrite, but here it is really still about AsterixDB, I think?  (Since most of these statements will be hidden in the Couchbase case?)
 ## <a id="Declarations">Declarations</a>
     DatabaseDeclaration ::= "USE" Identifier
@@ -55,7 +52,6 @@
       { "id": 2, "name": "IsbelDull", "friendCount": 2 }
 ## <a id="Lifecycle_management_statements">Lifecycle management statements</a>
@@ -74,17 +70,12 @@
 ### <a id="Dataverses"> Dataverses</a>
-    DatabaseSpecification ::= "DATAVERSE" Identifier IfNotExists ( "WITH" "FORMAT" StringLiteral
+    DatabaseSpecification ::= "DATAVERSE" Identifier IfNotExists
 The CREATE DATAVERSE statement is used to create new dataverses.
 To ease the authoring of reusable SQL++ scripts, an optional IF NOT EXISTS clause is included to allow
 creation to be requested either unconditionally or only if the dataverse does not already exist.
 If this clause is absent, an error is returned if a dataverse with the indicated name already exists.
-(Note: The `WITH FORMAT` clause in the syntax above is a placeholder for possible `future
-that can safely be ignored here.)
-> MC: Should we get rid of WITH FORMAT? (I think we should - here and in the system - if we ever do it
-I would actually expect it to be more fine-grained than the dataverse level.)
 The following example creates a new dataverse named TinySocial if one does not already exist.
@@ -94,7 +85,7 @@
 ### <a id="Types"> Types</a>
-    TypeSpecification    ::= "TYPE" FunctionOrTypeName IfNotExists "AS" TypeExpr
+    TypeSpecification    ::= "TYPE" FunctionOrTypeName IfNotExists "AS" RecordTypeDef
     FunctionOrTypeName   ::= QualifiedName
     IfNotExists          ::= ( <IF> <NOT> <EXISTS> )?
     TypeExpr             ::= RecordTypeDef | TypeReference | OrderedListTypeDef | UnorderedListTypeDef
@@ -106,9 +97,6 @@
     OrderedListTypeDef   ::= "[" ( TypeExpr ) "]"
     UnorderedListTypeDef ::= "{{" ( TypeExpr ) "}}"
-> TW: How should we refer to the data model? "Asterix Data Model" seems system specific.
-> MC: Agreed that this is an issue. Let's first decide and I can handle the issue in a later pass.
 The CREATE TYPE statement is used to create a new named ADM datatype.
 This type can then be used to create stored collections or utilized when defining one or more other ADM datatypes.
 Much more information about the Asterix Data Model (ADM) is available in the [data model reference guide](datamodel.html) to ADM.
@@ -116,8 +104,6 @@
 A record type can be defined as being either open or closed.
 Instances of a closed record type are not permitted to contain fields other than those specified in the create type statement.
 Instances of an open record type may carry additional fields, and open is the default for new types if neither option is specified.
-> MC: I had forgotten about options other than using CREATE TYPE to introduce new record types! (Are all of the other AS TypeExpr possibilities actually well-tested?)
 The following example creates a new ADM record type called GleambookUser type.
 Since it is defined as (defaulting to) being an open type,
@@ -171,11 +157,6 @@
     PrimaryKey           ::= <PRIMARY> <KEY> NestedField ( "," NestedField )*
     CompactionPolicy     ::= Identifier
-> TW: Again, a lot of AsterixDB in the following paragraph.
-> Also, while I'm sure that this was always like this, the separation of `Configuration`
-> from `Properties` looks pretty confusing ...
-> MC: Not sure what we should do about all this, actually! (I don't disagree. New JSON syntax coming, too?)
 The CREATE DATASET statement is used to create a new dataset.
 Datasets are named, unordered collections of ADM record type instances;
 they are where data lives persistently and are the usual targets for SQL++ queries.
@@ -187,9 +168,6 @@
 Internal datasets contain several advanced options that can be specified when appropriate.
 One such option is that random primary key (UUID) values can be auto-generated by declaring the field to be UUID and putting "AUTOGENERATED" after the "PRIMARY KEY" identifier.
 In this case, unlike other non-optional fields, a value for the auto-generated PK field should not be provided at insertion time by the user since each record's primary key field value will be auto-generated by the system.
-> TW: "The Filter-Based LSM Index Acceleration" seems to be quite system specific ...
-> MC: Indeed, but that is always inescapable in DDL reference manuals, no? (We have to decide what to say where. :-))
 Another advanced option, when creating an Internal dataset, is to specify the merge policy to control which of the
 underlying LSM storage components to be merged.
@@ -268,8 +246,6 @@
 `ENFORCING` an open field introduces a check that makes sure that the actual type of the indexed field
 (if the optional field exists in the record) always matches this specified (open) field type.
-*Editor's note: The ? shown above after the type is intended to be mandatory, and we need to make that happen.*
 The following example creates a btree index called gbAuthorIdx on the authorId field of the GleambookMessages dataset.
 This index can be useful for accelerating exact-match queries, range search queries, and joins involving the author-id
@@ -284,8 +260,6 @@
 #### Example
     CREATE INDEX gbSendTimeIdx ON GleambookMessages(sendTime: datetime?) TYPE BTREE ENFORCED;
-> MC: The above works in my branch (with ? mandatory) but not in the main branch. We need to change that. :-)
 The following example creates a btree index called crpUserScrNameIdx on screenName,
 a nested field residing within a record-valued user field in the ChirpMessages dataset.
@@ -389,10 +363,6 @@
     InsertStatement ::= <INSERT> <INTO> QualifiedName Query
-> TW: AsterixDB-specifc transactions semantics ...
-> Also, do we also support `UPSERT`?
-> MC: Yes to both. :-) Whoops. Wait, maybe not. We do have upsert in AQL, but not in SQL++ today, it seems. I'll document it anyway...? :-)
 The SQL++ INSERT statement is used to insert new data into a dataset.
 The data to be inserted comes from a SQL++ query expression.
 This expression can be as simple as a constant expression, or in general it can be any legal SQL++ query.
@@ -430,7 +400,7 @@
 ### <a id="Deletes">DELETEs</a>
-    DeleteStatement ::= <DELETE> <FROM> QualifiedName ( (<AS>)? Variable )? ( <WHERE> Expression )?
+    DeleteStatement ::= <DELETE> <FROM> QualifiedName ( ( <AS> )? Variable )? ( <WHERE> Expression )?
 The SQL++ DELETE statement is used to delete data from a target dataset.
 The data to be deleted is identified by a boolean expression involving the variable bound to the target dataset in the DELETE statement.
diff --git a/asterixdb/asterix-lang-sqlpp/src/main/javacc/SQLPP.jj b/asterixdb/asterix-lang-sqlpp/src/main/javacc/SQLPP.jj
index f330f40..6c4bc5c 100644
--- a/asterixdb/asterix-lang-sqlpp/src/main/javacc/SQLPP.jj
+++ b/asterixdb/asterix-lang-sqlpp/src/main/javacc/SQLPP.jj
@@ -414,7 +414,7 @@
   <TYPE> nameComponents = TypeName() ifNotExists = IfNotExists()
-  <AS> typeExpr = TypeExpr()
+  <AS> typeExpr = RecordTypeDef()
       long numValues = -1;
       String filename = null;
@@ -683,14 +683,12 @@
   String dvName = null;
   boolean ifNotExists = false;
-  String format = null;
   <DATAVERSE> dvName = Identifier()
   ifNotExists = IfNotExists()
-  ( LOOKAHEAD(1) <WITH> <FORMAT> format = ConstantString() )?
-      return new CreateDataverseStatement(new Identifier(dvName), format, ifNotExists);
+      return new CreateDataverseStatement(new Identifier(dvName), null, ifNotExists);
@@ -3086,7 +3084,6 @@
   | <FILTER : "filter">
   | <FLATTEN : "flatten">
   | <FOR : "for">
-  | <FORMAT : "format">
   | <FROM : "from">
   | <FULL : "full">
   | <FUNCTION : "function">
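
For reviewers who want to try out the StringLiteral change in the first hunk, here is a rough Python sketch of the two quoted forms. It is not taken from the AsterixDB code base: the names `SINGLE_QUOTED`, `DOUBLE_QUOTED`, `STRING_LITERAL` and `is_string_literal` are made up for illustration. It also assumes that the double-quoted alternative is meant to exclude the double quote character, whereas the rule as printed in the diff still uses `~["\'"]` in both alternatives.

```python
import re

# Single-quoted form, following the documented rule:
#   "\'" (<ESCAPE_APOS> | ~["\'"])* "\'"
SINGLE_QUOTED = r"'(?:\\'|[^'])*'"

# Double-quoted form using <ESCAPE_QUOT>.
# ASSUMPTION: the negated class excludes the double quote here; the rule
# as printed in the diff keeps ~["\'"] for this alternative as well.
DOUBLE_QUOTED = r'"(?:\\"|[^"])*"'

STRING_LITERAL = re.compile(f"(?:{SINGLE_QUOTED})|(?:{DOUBLE_QUOTED})")


def is_string_literal(text):
    """Return True if the whole text is one string literal under this sketch."""
    return STRING_LITERAL.fullmatch(text) is not None
```

With this reading, each quoting style only needs to escape its own delimiter, which is presumably why the patch introduces `<ESCAPE_QUOT>` next to `<ESCAPE_APOS>`.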


Gerrit-MessageType: newchange
Gerrit-Change-Id: I6b4f5c7ae48c022a6b8f8c48b3927e1981b70598
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Till Westmann <>
