impala-reviews mailing list archives

From "Dimitris Tsirogiannis (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3719: Simplify CREATE TABLE statements with Kudu tables
Date Wed, 14 Sep 2016 16:47:49 GMT
Dimitris Tsirogiannis has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4414

Change subject: IMPALA-3719: Simplify CREATE TABLE statements with Kudu tables
......................................................................

IMPALA-3719: Simplify CREATE TABLE statements with Kudu tables

With this commit we simplify the syntax and handling of CREATE TABLE
statements for both managed and external Kudu tables.

Syntax example:
CREATE TABLE foo(a INT, b STRING, PRIMARY KEY (a, b))
DISTRIBUTE BY HASH (a) INTO 3 BUCKETS,
RANGE (b) SPLIT ROWS (('abc', 'def'))
STORED AS KUDU

Changes:
1) Remove the requirement to specify table properties such as key
  columns in tblproperties.
2) Read table schema from Kudu. When attempting to create an external
  table "foo" in database "bar", Impala will search for a Kudu table
  named "bar.foo" or "foo" (Kudu doesn't have database namespaces
  yet).
3) The Kudu table is now required to exist at the time of creation in
  Impala.
4) Disallow table properties that could conflict with an existing
  table. Ex: key_columns cannot be specified.
5) Add KUDU as a file format.
6) Add a startup flag to impalad to specify the default Kudu master
  addresses. The flag is used as the default value for the table
  property kudu_master_addresses.
7) Fix a post-merge issue (IMPALA-3178) where DROP DATABASE CASCADE
  wasn't implemented for Kudu tables and was silently ignored, so the
  Kudu tables weren't removed in Kudu.
8) Remove DDL delegates. There was only one functional delegate (for
  Kudu); the existence of the other delegate and the use of delegates in
  general have led to confusion. The Kudu delegate only exists to provide
  functionality missing from Hive. Eventually Hive should have the needed
  functionality and the Kudu delegate (renamed in this patch to
  KuduCatalogOpExecutor) can be removed.
9) Add PRIMARY KEY at the column and table level. This syntax is fairly
  standard. When used at the column level, only one column can be
  marked as a key. When used at the table level, multiple columns can
  be used as a key. Only Kudu tables are allowed to use PRIMARY KEY.
  The old "kudu.key_columns" table property is no longer accepted,
  though it is still used internally. "PRIMARY" is now a keyword.
  "KEY" is expected to be common enough as an identifier that it is
  parsed with an ident-style declaration instead of being reserved as
  a keyword, to avoid conflicts.
10) Infer a Kudu table name if none was given. The table property
  "kudu.table_name" is now optional. If not given, the Kudu table name
  will be derived from the Hive Metastore database and table name.
  "CREATE TABLE foo.bar (i INT PRIMARY KEY) STORED AS KUDU" will create
  a table in Kudu named "foo.bar". If the database is "default" then
  the Kudu table name will not include the database.
11) Several improvements in the grammar related to the family
  of CREATE TABLE statements.
12) Add new tests and modify existing Kudu tests to use the new
  CREATE TABLE syntax.
13) Use Kudu master as the source of truth for table metadata instead
  of HMS. Table/column metadata is still stored in HMS in order to be
  able to use table and column statistics.
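
The naming rules in items 2 and 10 can be sketched as follows. This is
only an illustration of the behavior described above, not the code in
this patch: the function names and the "sales"/"orders" names are
hypothetical, and the order in which the external-table candidates are
tried (qualified name first) is an assumption.

```python
from typing import List

# Name of the default database, for which no prefix is added (item 10).
DEFAULT_DB = "default"


def infer_kudu_table_name(db: str, table: str) -> str:
    """Kudu table name to use when "kudu.table_name" is not given (item 10)."""
    if db == DEFAULT_DB:
        # Tables in the default database keep their bare name.
        return table
    return f"{db}.{table}"


def external_lookup_candidates(db: str, table: str) -> List[str]:
    """Kudu table names to look for when creating an external table (item 2).

    Assumption: the database-qualified name is tried before the bare name.
    """
    return [f"{db}.{table}", table]


if __name__ == "__main__":
    print(infer_kudu_table_name("foo", "bar"))
    print(infer_kudu_table_name("default", "bar"))
    print(external_lookup_candidates("sales", "orders"))
```

With these rules, "CREATE TABLE foo.bar (i INT PRIMARY KEY) STORED AS
KUDU" maps to the Kudu table "foo.bar", while the same statement for a
table in the "default" database maps to the bare table name.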

Not included in this commit:
- Additional column properties such as nullability and compression
encodings.

Change-Id: I7b9d51b2720ab57649abdb7d5c710ea04ff50dc1
---
M be/src/catalog/catalog.cc
M be/src/service/frontend.cc
M bin/start-catalogd.sh
M bin/start-impala-cluster.py
M common/thrift/CatalogObjects.thrift
M fe/src/main/cup/sql-parser.cup
A fe/src/main/java/com/cloudera/impala/analysis/AnalysisUtils.java
M fe/src/main/java/com/cloudera/impala/analysis/ColumnDef.java
M fe/src/main/java/com/cloudera/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/CreateTableDataSrcStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/CreateTableLikeStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/CreateTableStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/DistributeParam.java
A fe/src/main/java/com/cloudera/impala/analysis/TableDataLayout.java
A fe/src/main/java/com/cloudera/impala/analysis/TableDef.java
A fe/src/main/java/com/cloudera/impala/analysis/TableDefOptions.java
M fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
M fe/src/main/java/com/cloudera/impala/catalog/Catalog.java
M fe/src/main/java/com/cloudera/impala/catalog/Db.java
M fe/src/main/java/com/cloudera/impala/catalog/HdfsFileFormat.java
M fe/src/main/java/com/cloudera/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/com/cloudera/impala/catalog/KuduTable.java
M fe/src/main/java/com/cloudera/impala/catalog/Table.java
D fe/src/main/java/com/cloudera/impala/catalog/delegates/DdlDelegate.java
D fe/src/main/java/com/cloudera/impala/catalog/delegates/KuduDdlDelegate.java
D fe/src/main/java/com/cloudera/impala/catalog/delegates/UnsupportedOpDelegate.java
M fe/src/main/java/com/cloudera/impala/planner/HdfsPartitionFilter.java
M fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java
M fe/src/main/java/com/cloudera/impala/service/Frontend.java
M fe/src/main/java/com/cloudera/impala/service/JniCatalog.java
M fe/src/main/java/com/cloudera/impala/service/JniFrontend.java
A fe/src/main/java/com/cloudera/impala/service/KuduCatalogOpExecutor.java
A fe/src/main/java/com/cloudera/impala/util/KuduClient.java
M fe/src/main/java/com/cloudera/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/com/cloudera/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/com/cloudera/impala/analysis/ParserTest.java
M fe/src/test/java/com/cloudera/impala/service/JdbcTest.java
M fe/src/test/java/com/cloudera/impala/testutil/ImpaladTestCatalog.java
M infra/python/deps/requirements.txt
M testdata/bin/generate-schema-statements.py
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/tpch/tpch_schema_template.sql
D testdata/workloads/functional-query/queries/QueryTest/create_kudu.test
M testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
D testdata/workloads/functional-query/queries/QueryTest/kudu-show-create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test
M tests/common/__init__.py
A tests/common/kudu_test_suite.py
M tests/conftest.py
A tests/custom_cluster/test_kudu.py
M tests/query_test/test_kudu.py
56 files changed, 2,651 insertions(+), 1,870 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/4414/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I7b9d51b2720ab57649abdb7d5c710ea04ff50dc1
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
