From: tarmstrong@apache.org
To: commits@impala.incubator.apache.org
Reply-To: dev@impala.incubator.apache.org
Date: Thu, 24 Nov 2016 08:04:41 -0000
Message-Id: <6165e5d1c843477abaf1925a13471c27@git.apache.org>
Subject: [1/7] incubator-impala git commit: IMPALA-4283: Ensure Kudu-specific lineage and audit behavior
archived-at: Thu, 24 Nov 2016 08:04:51 -0000

Repository: incubator-impala
Updated Branches:
  refs/heads/master b82eed5ee -> 16552f6ed


IMPALA-4283: Ensure Kudu-specific lineage and audit behavior

With this commit we add support for auditing all Kudu-specific
operations and we enable column lineage for INSERT and UPSERT statements
on Kudu tables. No lineage output is generated for DELETE and UPDATE
statements.

Change-Id: Idc4ca1cd63bcfa4370c240a5c4a4126ed6704f4d
Reviewed-on: http://gerrit.cloudera.org:8080/5151
Reviewed-by: Dimitris Tsirogiannis
Tested-by: Internal Jenkins


Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/3934e13b
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/3934e13b
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/3934e13b

Branch: refs/heads/master
Commit: 3934e13b3b9da503f12363ba5d72cf3f151fd58b
Parents: b82eed5
Author: Dimitris Tsirogiannis
Authored: Fri Nov 18 23:26:26 2016 -0800
Committer: Internal Jenkins
Committed: Thu Nov 24 01:21:56 2016 +0000

----------------------------------------------------------------------
 .../impala/analysis/ColumnLineageGraph.java     |   2 -
 .../org/apache/impala/analysis/InsertStmt.java  |   7 +
 .../org/apache/impala/analysis/ModifyStmt.java  |   4 +-
 .../java/org/apache/impala/planner/Planner.java |  30 +-
 .../apache/impala/planner/PlannerContext.java   |   3 +
 .../apache/impala/analysis/AuditingTest.java    |  76 ++-
 .../apache/impala/planner/PlannerTestBase.java  |   3 +-
 .../queries/PlannerTest/lineage.test            | 528 ++++++++++++++++++-
 8 files changed, 614 insertions(+), 39 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3934e13b/fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
----------------------------------------------------------------------
diff --git a/fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java b/fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
index d84cccc..92b7106 100644
--- a/fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
+++ b/fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
@@ -17,9 +17,7 @@

 package org.apache.impala.analysis;

-import java.text.SimpleDateFormat;
 import java.util.Collection;
-import java.util.Date;
 import java.util.LinkedHashMap;
 import java.util.List;
 import java.util.Map;

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3934e13b/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
----------------------------------------------------------------------
diff --git a/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java b/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
index 4ae253e..1dacf48 100644
--- a/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
+++ b/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
@@ -789,6 +789,13 @@ public class InsertStmt extends StatementBase {
   public boolean hasClusteredHint() { return hasClusteredHint_; }
   public ArrayList<Expr> getPrimaryKeyExprs() { return primaryKeyExprs_; }

+  public List<String> getMentionedColumns() {
+    List<String> result = Lists.newArrayList();
+    List<Column> columns = table_.getColumns();
+    for (Integer i: mentionedColumns_) result.add(columns.get(i).getName());
+    return result;
+  }
+
   public DataSink createDataSink() {
     // analyze() must have been called before.
     Preconditions.checkState(table_ != null);

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3934e13b/fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
----------------------------------------------------------------------
diff --git a/fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java b/fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
index 0e94b29..f4a48b2 100644
--- a/fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
+++ b/fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
@@ -149,9 +149,7 @@ public abstract class ModifyStmt extends StatementBase {

     // Make sure that the user is allowed to modify the target table, since no
     // UPDATE / DELETE privilege exists, we reuse the INSERT one.
-    analyzer.registerPrivReq(new PrivilegeRequestBuilder()
-        .onTable(table_.getDb().getName(), table_.getName())
-        .allOf(Privilege.INSERT).toRequest());
+    analyzer.registerAuthAndAuditEvent(dstTbl, Privilege.INSERT);

     // Validates the assignments_ and creates the sourceStmt_.
     if (sourceStmt_ == null) createSourceStmt(analyzer);

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3934e13b/fe/src/main/java/org/apache/impala/planner/Planner.java
----------------------------------------------------------------------
diff --git a/fe/src/main/java/org/apache/impala/planner/Planner.java b/fe/src/main/java/org/apache/impala/planner/Planner.java
index a1566b0..1762144 100644
--- a/fe/src/main/java/org/apache/impala/planner/Planner.java
+++ b/fe/src/main/java/org/apache/impala/planner/Planner.java
@@ -176,17 +176,35 @@ public class Planner {

     ColumnLineageGraph graph = ctx_.getRootAnalyzer().getColumnLineageGraph();
     if (BackendConfig.INSTANCE.getComputeLineage() || RuntimeEnv.INSTANCE.isTestEnv()) {
+      // Lineage is disabled for UPDATE AND DELETE statements
+      if (ctx_.isUpdateOrDelete()) return fragments;
       // Compute the column lineage graph
       if (ctx_.isInsertOrCtas()) {
-        Table targetTable = ctx_.getAnalysisResult().getInsertStmt().getTargetTable();
-        graph.addTargetColumnLabels(targetTable);
-        Preconditions.checkNotNull(targetTable);
-        // Lineage is not currently supported for Kudu tables (see IMPALA-4283)
-        if (targetTable instanceof KuduTable) return fragments;
+        InsertStmt insertStmt = ctx_.getAnalysisResult().getInsertStmt();
         List<Expr> exprs = Lists.newArrayList();
-        if (targetTable instanceof HBaseTable) {
+        Table targetTable = insertStmt.getTargetTable();
+        Preconditions.checkNotNull(targetTable);
+        if (targetTable instanceof KuduTable) {
+          if (ctx_.isInsert()) {
+            // For insert statements on Kudu tables, we only need to consider
+            // the labels of columns mentioned in the column list.
+            List<String> mentionedColumns = insertStmt.getMentionedColumns();
+            Preconditions.checkState(!mentionedColumns.isEmpty());
+            List<String> targetColLabels = Lists.newArrayList();
+            String tblFullName = targetTable.getFullName();
+            for (String column: mentionedColumns) {
+              targetColLabels.add(tblFullName + "." + column);
+            }
+            graph.addTargetColumnLabels(targetColLabels);
+          } else {
+            graph.addTargetColumnLabels(targetTable);
+          }
+          exprs.addAll(resultExprs);
+        } else if (targetTable instanceof HBaseTable) {
+          graph.addTargetColumnLabels(targetTable);
           exprs.addAll(resultExprs);
         } else {
+          graph.addTargetColumnLabels(targetTable);
           exprs.addAll(ctx_.getAnalysisResult().getInsertStmt().getPartitionKeyExprs());
           exprs.addAll(resultExprs.subList(0,
               targetTable.getNonClusteringColumns().size()));

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3934e13b/fe/src/main/java/org/apache/impala/planner/PlannerContext.java
----------------------------------------------------------------------
diff --git a/fe/src/main/java/org/apache/impala/planner/PlannerContext.java b/fe/src/main/java/org/apache/impala/planner/PlannerContext.java
index 721acf9..06208aa 100644
--- a/fe/src/main/java/org/apache/impala/planner/PlannerContext.java
+++ b/fe/src/main/java/org/apache/impala/planner/PlannerContext.java
@@ -89,6 +89,9 @@ public class PlannerContext {
   public boolean isInsertOrCtas() {
     return analysisResult_.isInsertStmt() || analysisResult_.isCreateTableAsSelectStmt();
   }
+  public boolean isInsert() { return analysisResult_.isInsertStmt(); }
+  public boolean isUpdateOrDelete() {
+    return analysisResult_.isUpdateStmt() || analysisResult_.isDeleteStmt(); }
   public boolean isQuery() { return analysisResult_.isQueryStmt(); }
   public boolean hasTableSink() {
     return isInsertOrCtas() || analysisResult_.isUpdateStmt()

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3934e13b/fe/src/test/java/org/apache/impala/analysis/AuditingTest.java
----------------------------------------------------------------------
diff --git a/fe/src/test/java/org/apache/impala/analysis/AuditingTest.java b/fe/src/test/java/org/apache/impala/analysis/AuditingTest.java
index eee9cce..df23c1e 100644
--- a/fe/src/test/java/org/apache/impala/analysis/AuditingTest.java
+++ b/fe/src/test/java/org/apache/impala/analysis/AuditingTest.java
@@ -106,7 +106,7 @@ public class AuditingTest extends AnalyzerTest {
         new TAccessEvent("functional.alltypes", TCatalogObjectType.TABLE, "INSERT")));

     // Insert + inline-view.
-    accessEvents = AnalyzeAccessEvents(
+    accessEvents = AnalyzeAccessEvents(
         "insert into functional.alltypes partition(month,year) " +
         "select b.* from functional.alltypesagg a join (select * from " +
         "functional_rc.alltypes) b on (a.int_col = b.int_col)");
@@ -370,6 +370,80 @@ public class AuditingTest extends AnalyzerTest {
         new TAccessEvent("functional.alltypesagg", TCatalogObjectType.TABLE, "SELECT")));
   }

+  @Test
+  public void TestKuduStatements() throws AuthorizationException, AnalysisException {
+    TestUtils.assumeKuduIsSupported();
+    // Select
+    Set<TAccessEvent> accessEvents =
+        AnalyzeAccessEvents("select * from functional_kudu.testtbl");
+    Assert.assertEquals(accessEvents, Sets.newHashSet(
+        new TAccessEvent("functional_kudu.testtbl", TCatalogObjectType.TABLE, "SELECT")));
+
+    // Insert
+    accessEvents = AnalyzeAccessEvents(
+        "insert into functional_kudu.testtbl (id) select id from " +
+        "functional_kudu.alltypes");
+    Assert.assertEquals(accessEvents, Sets.newHashSet(
+        new TAccessEvent("functional_kudu.alltypes", TCatalogObjectType.TABLE, "SELECT"),
+        new TAccessEvent("functional_kudu.testtbl", TCatalogObjectType.TABLE, "INSERT")));
+
+    // Upsert
+    accessEvents = AnalyzeAccessEvents(
+        "upsert into functional_kudu.testtbl (id) select id from " +
+        "functional_kudu.alltypes");
+    Assert.assertEquals(accessEvents, Sets.newHashSet(
+        new TAccessEvent("functional_kudu.alltypes", TCatalogObjectType.TABLE, "SELECT"),
+        new TAccessEvent("functional_kudu.testtbl", TCatalogObjectType.TABLE, "INSERT")));
+
+    // Delete
+    accessEvents = AnalyzeAccessEvents(
+        "delete from functional_kudu.testtbl where id = 1");
+    Assert.assertEquals(accessEvents, Sets.newHashSet(
+        new TAccessEvent("functional_kudu.testtbl", TCatalogObjectType.TABLE, "SELECT"),
+        new TAccessEvent("functional_kudu.testtbl", TCatalogObjectType.TABLE, "INSERT")));
+
+    // Delete using a complex query
+    accessEvents = AnalyzeAccessEvents(
+        "delete c from functional_kudu.testtbl c, functional_kudu.alltypes s where " +
+        "c.id = s.id and s.int_col < 10");
+    Assert.assertEquals(accessEvents, Sets.newHashSet(
+        new TAccessEvent("functional_kudu.testtbl", TCatalogObjectType.TABLE, "SELECT"),
+        new TAccessEvent("functional_kudu.alltypes", TCatalogObjectType.TABLE, "SELECT"),
+        new TAccessEvent("functional_kudu.testtbl", TCatalogObjectType.TABLE, "INSERT")));
+
+    // Update
+    accessEvents = AnalyzeAccessEvents(
+        "update functional_kudu.testtbl set name = 'test' where id < 10");
+    Assert.assertEquals(accessEvents, Sets.newHashSet(
+        new TAccessEvent("functional_kudu.testtbl", TCatalogObjectType.TABLE, "SELECT"),
+        new TAccessEvent("functional_kudu.testtbl", TCatalogObjectType.TABLE, "INSERT")));
+
+    // Drop table
+    accessEvents = AnalyzeAccessEvents("drop table functional_kudu.testtbl");
+    Assert.assertEquals(accessEvents, Sets.newHashSet(new TAccessEvent(
+        "functional_kudu.testtbl", TCatalogObjectType.TABLE, "DROP")));
+
+    // Show create table
+    accessEvents = AnalyzeAccessEvents("show create table functional_kudu.testtbl");
+    Assert.assertEquals(accessEvents, Sets.newHashSet(new TAccessEvent(
+        "functional_kudu.testtbl", TCatalogObjectType.TABLE, "VIEW_METADATA")));
+
+    // Compute stats
+    accessEvents = AnalyzeAccessEvents("compute stats functional_kudu.testtbl");
+    Assert.assertEquals(accessEvents, Sets.newHashSet(
+        new TAccessEvent("functional_kudu.testtbl", TCatalogObjectType.TABLE, "ALTER")));
+
+    // Describe
+    accessEvents = AnalyzeAccessEvents("describe functional_kudu.testtbl");
+    Assert.assertEquals(accessEvents, Sets.newHashSet(new TAccessEvent(
+        "functional_kudu.testtbl", TCatalogObjectType.TABLE, "ANY")));
+
+    // Describe formatted
+    accessEvents = AnalyzeAccessEvents("describe formatted functional_kudu.testtbl");
+    Assert.assertEquals(accessEvents, Sets.newHashSet(new TAccessEvent(
+        "functional_kudu.testtbl", TCatalogObjectType.TABLE, "VIEW_METADATA")));
+  }
+
   /**
    * Analyzes the given statement and returns the set of TAccessEvents
    * that were captured as part of analysis.

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3934e13b/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
----------------------------------------------------------------------
diff --git a/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java b/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
index 745efa3..4fff233 100644
--- a/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
+++ b/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
@@ -653,7 +653,8 @@ public class PlannerTestBase extends FrontendTestBase {
     ArrayList<String> expectedLineage = testCase.getSectionContents(Section.LINEAGE);
     if (expectedLineage == null || expectedLineage.isEmpty()) return;
     TLineageGraph lineageGraph = null;
-    if (execRequest != null && execRequest.isSetQuery_exec_request()) {
+    if (execRequest == null) return;
+    if (execRequest.isSetQuery_exec_request()) {
       lineageGraph = execRequest.query_exec_request.lineage_graph;
     } else if (execRequest.isSetCatalog_op_request()) {
       lineageGraph = execRequest.catalog_op_request.lineage_graph;

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3934e13b/testdata/workloads/functional-planner/queries/PlannerTest/lineage.test
----------------------------------------------------------------------
diff --git a/testdata/workloads/functional-planner/queries/PlannerTest/lineage.test b/testdata/workloads/functional-planner/queries/PlannerTest/lineage.test
index b18a044..c2d916e 100644
--- a/testdata/workloads/functional-planner/queries/PlannerTest/lineage.test
+++ b/testdata/workloads/functional-planner/queries/PlannerTest/lineage.test
@@ -192,10 +192,10 @@ order by b.bigint_col limit 10
 }
 ====
 # CTAS queries
-create table t as select int_col, tinyint_col from functional.alltypes
+create table lineage_test_tbl as select int_col, tinyint_col from functional.alltypes
 ---- LINEAGE
 {
-    "queryText":"create table t as select int_col, tinyint_col from functional.alltypes",
+    "queryText":"create table lineage_test_tbl as select int_col, tinyint_col from functional.alltypes",
     "hash":"f7666959b65ce1aa2a695ae90adb7c85",
     "user":"dev",
     "timestamp":1446159271,
@@ -223,7 +223,7 @@ create table t as select int_col, tinyint_col from functional.alltypes
         {
             "id":0,
             "vertexType":"COLUMN",
-            "vertexId":"default.t.int_col"
+            "vertexId":"default.lineage_test_tbl.int_col"
         },
         {
             "id":1,
@@ -233,7 +233,7 @@ create table t as select int_col, tinyint_col from functional.alltypes
         {
             "id":2,
             "vertexType":"COLUMN",
-            "vertexId":"default.t.tinyint_col"
+            "vertexId":"default.lineage_test_tbl.tinyint_col"
         },
         {
             "id":3,
@@ -243,13 +243,13 @@ create table t as select int_col, tinyint_col from functional.alltypes
     ]
 }
 ====
-create table t as
+create table lineage_test_tbl as
 select distinct a.int_col, a.string_col from functional.alltypes a
 inner join functional.alltypessmall b on (a.id = b.id)
 where a.year = 2009 and b.month = 2
 ---- LINEAGE
 {
-    "queryText":"create table t as\nselect distinct a.int_col, a.string_col from functional.alltypes a\ninner join functional.alltypessmall b on (a.id = b.id)\nwhere a.year = 2009 and b.month = 2",
+    "queryText":"create table lineage_test_tbl as\nselect distinct a.int_col, a.string_col from functional.alltypes a\ninner join functional.alltypessmall b on (a.id = b.id)\nwhere a.year = 2009 and b.month = 2",
     "hash":"6d83126f8e34eec31ed4e111e1c32e78",
     "user":"dev",
     "timestamp":1446159271,
@@ -290,7 +290,7 @@ where a.year = 2009 and b.month = 2
         {
             "id":0,
             "vertexType":"COLUMN",
-            "vertexId":"default.t.int_col"
+            "vertexId":"default.lineage_test_tbl.int_col"
         },
         {
             "id":1,
@@ -300,7 +300,7 @@ where a.year = 2009 and b.month = 2
         {
             "id":2,
             "vertexType":"COLUMN",
-            "vertexId":"default.t.string_col"
+            "vertexId":"default.lineage_test_tbl.string_col"
         },
         {
             "id":3,
@@ -330,13 +330,13 @@ where a.year = 2009 and b.month = 2
     ]
 }
 ====
-create table t as
+create table lineage_test_tbl as
 select * from
  (select * from
  (select int_col from functional.alltypestiny limit 1) v1 ) v2
 ---- LINEAGE
 {
-    "queryText":"create table t as\nselect * from\n (select * from\n (select int_col from functional.alltypestiny limit 1) v1 ) v2",
+    "queryText":"create table lineage_test_tbl as\nselect * from\n (select * from\n (select int_col from functional.alltypestiny limit 1) v1 ) v2",
     "hash":"f719f8eba46eda75e9cc560310885558",
     "user":"dev",
     "timestamp":1446159271,
@@ -355,7 +355,7 @@ select * from
         {
             "id":0,
             "vertexType":"COLUMN",
-            "vertexId":"default.t.int_col"
+            "vertexId":"default.lineage_test_tbl.int_col"
         },
         {
             "id":1,
@@ -366,10 +366,10 @@ select * from
 }
 ====
 # CTAS from HBase table
-create table tm as select * from functional_hbase.alltypes limit 5
+create table lineage_test_tblm as select * from functional_hbase.alltypes limit 5
 ---- LINEAGE
 {
-    "queryText":"create table tm as select * from functional_hbase.alltypes limit 5",
+    "queryText":"create table lineage_test_tblm as select * from functional_hbase.alltypes limit 5",
     "hash":"bedebc5bc72bbc6aec385c514944daae",
     "user":"dev",
     "timestamp":1446159271,
@@ -496,7 +496,7 @@ create table tm as select * from functional_hbase.alltypes limit 5
         {
             "id":0,
             "vertexType":"COLUMN",
-            "vertexId":"default.tm.id"
+            "vertexId":"default.lineage_test_tblm.id"
         },
         {
             "id":1,
@@ -506,7 +506,7 @@ create table tm as select * from functional_hbase.alltypes limit 5
         {
             "id":2,
             "vertexType":"COLUMN",
-            "vertexId":"default.tm.bigint_col"
+            "vertexId":"default.lineage_test_tblm.bigint_col"
         },
         {
             "id":3,
@@ -516,7 +516,7 @@ create table tm as select * from functional_hbase.alltypes limit 5
         {
             "id":4,
             "vertexType":"COLUMN",
-            "vertexId":"default.tm.bool_col"
+            "vertexId":"default.lineage_test_tblm.bool_col"
         },
         {
             "id":5,
@@ -526,7 +526,7 @@ create table tm as select * from functional_hbase.alltypes limit 5
         {
             "id":6,
             "vertexType":"COLUMN",
-            "vertexId":"default.tm.date_string_col"
+            "vertexId":"default.lineage_test_tblm.date_string_col"
         },
         {
             "id":7,
@@ -536,7 +536,7 @@ create table tm as select * from functional_hbase.alltypes limit 5
         {
             "id":8,
             "vertexType":"COLUMN",
-            "vertexId":"default.tm.double_col"
+            "vertexId":"default.lineage_test_tblm.double_col"
         },
         {
             "id":9,
@@ -546,7 +546,7 @@ create table tm as select * from functional_hbase.alltypes limit 5
         {
             "id":10,
             "vertexType":"COLUMN",
-            "vertexId":"default.tm.float_col"
+            "vertexId":"default.lineage_test_tblm.float_col"
         },
         {
             "id":11,
@@ -556,7 +556,7 @@ create table tm as select * from functional_hbase.alltypes limit 5
         {
             "id":12,
             "vertexType":"COLUMN",
-            "vertexId":"default.tm.int_col"
+            "vertexId":"default.lineage_test_tblm.int_col"
         },
         {
             "id":13,
@@ -566,7 +566,7 @@ create table tm as select * from functional_hbase.alltypes limit 5
         {
             "id":14,
             "vertexType":"COLUMN",
-            "vertexId":"default.tm.month"
+            "vertexId":"default.lineage_test_tblm.month"
         },
         {
             "id":15,
@@ -576,7 +576,7 @@ create table tm as select * from functional_hbase.alltypes limit 5
         {
             "id":16,
             "vertexType":"COLUMN",
-            "vertexId":"default.tm.smallint_col"
+            "vertexId":"default.lineage_test_tblm.smallint_col"
         },
         {
             "id":17,
@@ -586,7 +586,7 @@ create table tm as select * from functional_hbase.alltypes limit 5
         {
             "id":18,
             "vertexType":"COLUMN",
-            "vertexId":"default.tm.string_col"
+            "vertexId":"default.lineage_test_tblm.string_col"
         },
         {
             "id":19,
@@ -596,7 +596,7 @@ create table tm as select * from functional_hbase.alltypes limit 5
         {
             "id":20,
             "vertexType":"COLUMN",
-            "vertexId":"default.tm.timestamp_col"
+            "vertexId":"default.lineage_test_tblm.timestamp_col"
         },
         {
             "id":21,
@@ -606,7 +606,7 @@ create table tm as select * from functional_hbase.alltypes limit 5
         {
             "id":22,
             "vertexType":"COLUMN",
-            "vertexId":"default.tm.tinyint_col"
+            "vertexId":"default.lineage_test_tblm.tinyint_col"
         },
         {
             "id":23,
@@ -616,7 +616,7 @@ create table tm as select * from functional_hbase.alltypes limit 5
         {
             "id":24,
             "vertexType":"COLUMN",
-            "vertexId":"default.tm.year"
+            "vertexId":"default.lineage_test_tblm.year"
         },
         {
             "id":25,
@@ -4453,3 +4453,479 @@ where not exists (select 1 from functional.alltypes a where v.id = a.id)
     ]
 }
 ====
+# Select query that accesses a Kudu table
+select count(*) from functional_kudu.alltypes k join functional.alltypes h on k.id = h.id
+where k.int_col < 10
+---- LINEAGE
+{
+    "queryText":"select count(*) from functional_kudu.alltypes k join functional.alltypes h on k.id = h.id\nwhere k.int_col < 10",
+    "hash":"7b7c92d488186d869bb6b78c97666f41",
+    "user":"dev",
+    "timestamp":1479538352,
+    "edges":[
+        {
+            "sources":[
+            ],
+            "targets":[
+                0
+            ],
+            "edgeType":"PROJECTION"
+        },
+        {
+            "sources":[
+                1,
+                2,
+                3
+            ],
+            "targets":[
+                0
+            ],
+            "edgeType":"PREDICATE"
+        }
+    ],
+    "vertices":[
+        {
+            "id":0,
+            "vertexType":"COLUMN",
+            "vertexId":"count(*)"
+        },
+        {
+            "id":1,
+            "vertexType":"COLUMN",
+            "vertexId":"functional_kudu.alltypes.int_col"
+        },
+        {
+            "id":2,
+            "vertexType":"COLUMN",
+            "vertexId":"functional.alltypes.id"
+        },
+        {
+            "id":3,
+            "vertexType":"COLUMN",
+            "vertexId":"functional_kudu.alltypes.id"
+        }
+    ]
+}
+====
+# Insert into a Kudu table
+insert into functional_kudu.testtbl select id, string_col as name, int_col as zip from
+functional.alltypes a where a.id < 100
+---- LINEAGE
+{
+    "queryText":"insert into functional_kudu.testtbl select id, string_col as name, int_col as zip from\nfunctional.alltypes a where a.id < 100",
+    "hash":"87a59bac56c6ad27f7af6e71af46d552",
+    "user":"dev",
+    "timestamp":1479539012,
+    "edges":[
+        {
+            "sources":[
+                1
+            ],
+            "targets":[
+                0
+            ],
+            "edgeType":"PROJECTION"
+        },
+        {
+            "sources":[
+                3
+            ],
+            "targets":[
+                2
+            ],
+            "edgeType":"PROJECTION"
+        },
+        {
+            "sources":[
+                5
+            ],
+            "targets":[
+                4
+            ],
+            "edgeType":"PROJECTION"
+        },
+        {
+            "sources":[
+                1
+            ],
+            "targets":[
+                0,
+                2,
+                4
+            ],
+            "edgeType":"PREDICATE"
+        }
+    ],
+    "vertices":[
+        {
+            "id":0,
+            "vertexType":"COLUMN",
+            "vertexId":"functional_kudu.testtbl.id"
+        },
+        {
+            "id":1,
+            "vertexType":"COLUMN",
+            "vertexId":"functional.alltypes.id"
+        },
+        {
+            "id":2,
+            "vertexType":"COLUMN",
+            "vertexId":"functional_kudu.testtbl.name"
+        },
+        {
+            "id":3,
+            "vertexType":"COLUMN",
+            "vertexId":"functional.alltypes.string_col"
+        },
+        {
+            "id":4,
+            "vertexType":"COLUMN",
+            "vertexId":"functional_kudu.testtbl.zip"
+        },
+        {
+            "id":5,
+            "vertexType":"COLUMN",
+            "vertexId":"functional.alltypes.int_col"
+        }
+    ]
+}
+====
+# Insert into a Kudu table with a column list specified
+insert into functional_kudu.testtbl (name, id) select string_col as name, id from
+functional.alltypes where id < 10
+---- LINEAGE
+{
+    "queryText":"insert into functional_kudu.testtbl (name, id) select string_col as name, id from\nfunctional.alltypes where id < 10",
+    "hash":"0bccfdbf4118e6d5a3d94062ecb5130a",
+    "user":"dev",
+    "timestamp":1479933751,
+    "edges":[
+        {
+            "sources":[
+                1
+            ],
+            "targets":[
+                0
+            ],
+            "edgeType":"PROJECTION"
+        },
+        {
+            "sources":[
+                3
+            ],
+            "targets":[
+                2
+            ],
+            "edgeType":"PROJECTION"
+        },
+        {
+            "sources":[
+                1
+            ],
+            "targets":[
+                0,
+                2
+            ],
+            "edgeType":"PREDICATE"
+        }
+    ],
+    "vertices":[
+        {
+            "id":0,
+            "vertexType":"COLUMN",
+            "vertexId":"functional_kudu.testtbl.id"
+        },
+        {
+            "id":1,
+            "vertexType":"COLUMN",
+            "vertexId":"functional.alltypes.id"
+        },
+        {
+            "id":2,
+            "vertexType":"COLUMN",
+            "vertexId":"functional_kudu.testtbl.name"
+        },
+        {
+            "id":3,
+            "vertexType":"COLUMN",
+            "vertexId":"functional.alltypes.string_col"
+        }
+    ]
+}
+====
+# Upsert into a Kudu table with a column list specified
+upsert into functional_kudu.testtbl (name, id) select string_col as name, id from
+functional.alltypes where id < 10
+---- LINEAGE
+{
+    "queryText":"upsert into functional_kudu.testtbl (name, id) select string_col as name, id from\nfunctional.alltypes where id < 10",
+    "hash":"f4c1e7b016e75012f7268f2f42ae5630",
+    "user":"dev",
+    "timestamp":1479933751,
+    "edges":[
+        {
+            "sources":[
+                1
+            ],
+            "targets":[
+                0
+            ],
+            "edgeType":"PROJECTION"
+        },
+        {
+            "sources":[
+                3
+            ],
+            "targets":[
+                2
+            ],
+            "edgeType":"PROJECTION"
+        },
+        {
+            "sources":[
+                1
+            ],
+            "targets":[
+                0,
+                2
+            ],
+            "edgeType":"PREDICATE"
+        }
+    ],
+    "vertices":[
+        {
+            "id":0,
+            "vertexType":"COLUMN",
+            "vertexId":"functional_kudu.testtbl.id"
+        },
+        {
+            "id":1,
+            "vertexType":"COLUMN",
+            "vertexId":"functional.alltypes.id"
+        },
+        {
+            "id":2,
+            "vertexType":"COLUMN",
+            "vertexId":"functional_kudu.testtbl.name"
+        },
+        {
+            "id":3,
+            "vertexType":"COLUMN",
+            "vertexId":"functional.alltypes.string_col"
+        }
+    ]
+}
+====
+# CTAS a Kudu table
+create table kudu_ctas primary key (id) distribute by hash (id) into 3 buckets
+stored as kudu as select id, bool_col, tinyint_col, smallint_col, int_col,
+bigint_col, float_col, double_col, date_string_col, string_col
+from functional.alltypestiny
+---- LINEAGE
+{
+    "queryText":"create table kudu_ctas primary key (id) distribute by hash (id) into 3 buckets\nstored as kudu as select id, bool_col, tinyint_col, smallint_col, int_col,\nbigint_col, float_col, double_col, date_string_col, string_col\nfrom functional.alltypestiny",
+    "hash":"6e3e192c7fb8bb6b22674a9b7b488b55",
+    "user":"dev",
+    "timestamp":1479933751,
+    "edges":[
+        {
+            "sources":[
+                1
+            ],
+            "targets":[
+                0
+            ],
+            "edgeType":"PROJECTION"
+        },
+        {
+            "sources":[
+                3
+            ],
+            "targets":[
+                2
+            ],
+            "edgeType":"PROJECTION"
+        },
+        {
+            "sources":[
+                5
+            ],
+            "targets":[
+                4
+            ],
+            "edgeType":"PROJECTION"
+        },
+        {
+            "sources":[
+                7
+            ],
+            "targets":[
+                6
+            ],
+            "edgeType":"PROJECTION"
+        },
+        {
+            "sources":[
+                9
+            ],
+            "targets":[
+                8
+            ],
+            "edgeType":"PROJECTION"
+        },
+        {
+            "sources":[
+                11
+            ],
+            "targets":[
+                10
+            ],
+            "edgeType":"PROJECTION"
+        },
+        {
+            "sources":[
+                13
+            ],
+            "targets":[
+                12
+            ],
+            "edgeType":"PROJECTION"
+        },
+        {
+            "sources":[
+                15
+            ],
+            "targets":[
+                14
+            ],
+            "edgeType":"PROJECTION"
+        },
+        {
+            "sources":[
+                17
+            ],
+            "targets":[
+                16
+            ],
+            "edgeType":"PROJECTION"
+        },
+        {
+            "sources":[
+                19
+            ],
+            "targets":[
+                18
+            ],
+            "edgeType":"PROJECTION"
+        }
+    ],
+    "vertices":[
+        {
+            "id":0,
+            "vertexType":"COLUMN",
+            "vertexId":"default.kudu_ctas.id"
+        },
+        {
+            "id":1,
+            "vertexType":"COLUMN",
+            "vertexId":"functional.alltypestiny.id"
+        },
+        {
+            "id":2,
+            "vertexType":"COLUMN",
+            "vertexId":"default.kudu_ctas.bool_col"
+        },
+        {
+            "id":3,
+            "vertexType":"COLUMN",
+            "vertexId":"functional.alltypestiny.bool_col"
+        },
+        {
+            "id":4,
+            "vertexType":"COLUMN",
+            "vertexId":"default.kudu_ctas.tinyint_col"
+        },
+        {
+            "id":5,
+            "vertexType":"COLUMN",
+            "vertexId":"functional.alltypestiny.tinyint_col"
+        },
+        {
+            "id":6,
+            "vertexType":"COLUMN",
+            "vertexId":"default.kudu_ctas.smallint_col"
+        },
+        {
+            "id":7,
+            "vertexType":"COLUMN",
+            "vertexId":"functional.alltypestiny.smallint_col"
+        },
+        {
+            "id":8,
+            "vertexType":"COLUMN",
+            "vertexId":"default.kudu_ctas.int_col"
+        },
+        {
+            "id":9,
+            "vertexType":"COLUMN",
+            "vertexId":"functional.alltypestiny.int_col"
+        },
+        {
+            "id":10,
+            "vertexType":"COLUMN",
+            "vertexId":"default.kudu_ctas.bigint_col"
+        },
+        {
+            "id":11,
+            "vertexType":"COLUMN",
+            "vertexId":"functional.alltypestiny.bigint_col"
+        },
+        {
+            "id":12,
+            "vertexType":"COLUMN",
+            "vertexId":"default.kudu_ctas.float_col"
+        },
+        {
+            "id":13,
+            "vertexType":"COLUMN",
+            "vertexId":"functional.alltypestiny.float_col"
+        },
+        {
+            "id":14,
+            "vertexType":"COLUMN",
+            "vertexId":"default.kudu_ctas.double_col"
+        },
+        {
+            "id":15,
+            "vertexType":"COLUMN",
+            "vertexId":"functional.alltypestiny.double_col"
+        },
+        {
+            "id":16,
+            "vertexType":"COLUMN",
+            "vertexId":"default.kudu_ctas.date_string_col"
+        },
+        {
+            "id":17,
+            "vertexType":"COLUMN",
+            "vertexId":"functional.alltypestiny.date_string_col"
+        },
+        {
+            "id":18,
+            "vertexType":"COLUMN",
+            "vertexId":"default.kudu_ctas.string_col"
+        },
+        {
+            "id":19,
+            "vertexType":"COLUMN",
+            "vertexId":"functional.alltypestiny.string_col"
+        }
+    ]
+}
+====
+# No lineage should be generated for UPDATE
+update functional_kudu.alltypes set int_col = 1 where id = 1
+====
+# No lineage should be generated from DELETE
+delete from functional_kudu.alltypes where id = 1
+====
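
For illustration only: the Kudu INSERT branch added to Planner.java in this commit builds one "table.column" label per column mentioned in the statement. The sketch below isolates that step under stated assumptions. The class KuduLineageLabelSketch, the method targetColumnLabels and its parameters are hypothetical stand-ins that are not part of Impala, and plain java.util collections replace the Guava helpers used in the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class KuduLineageLabelSketch {
  /**
   * Builds "db.table.column" labels for the given column positions only.
   * Hypothetical helper: it mirrors the idea of InsertStmt.getMentionedColumns()
   * combined with the label loop in Planner, but it is not Impala code.
   */
  static List<String> targetColumnLabels(
      String tblFullName, List<String> tableColumns, List<Integer> mentionedPositions) {
    // The patch asserts that at least one column is mentioned; do the same here.
    if (mentionedPositions.isEmpty()) {
      throw new IllegalStateException("expected at least one mentioned column");
    }
    List<String> labels = new ArrayList<>();
    for (int pos : mentionedPositions) {
      labels.add(tblFullName + "." + tableColumns.get(pos));
    }
    return labels;
  }

  public static void main(String[] args) {
    // Assumed table layout, taken from the column names in the lineage tests.
    List<String> columns = Arrays.asList("id", "name", "zip");
    // Positions 1 and 0 stand for a column list such as (name, id).
    List<String> labels =
        targetColumnLabels("functional_kudu.testtbl", columns, Arrays.asList(1, 0));
    System.out.println(labels);
  }
}
```

Columns that are not mentioned (zip in this call) get no label, which matches the expected lineage for the column-list INSERT and UPSERT tests above, where only id and name appear as target vertices.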