impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Taras Bobrovytsky (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5036: Parquet count star optimization
Date Wed, 07 Jun 2017 00:58:08 GMT
Taras Bobrovytsky has uploaded a new patch set (#2).

Change subject: IMPALA-5036: Parquet count star optimization
......................................................................

IMPALA-5036: Parquet count star optimization

Instead of materializing empty rows when computing count star, we use
the data stored in the Parquet num rows statistic. The Parquet scanner
tuple is modified to have one slot into which we will write the num rows
statistic. The aggregate function is changed from count to sum.

Testing:
- Ran tests locally

Change-Id: I536b85c014821296aed68a0c68faadae96005e62
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/FunctionParams.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
A fe/src/main/java/org/apache/impala/rewrite/NormalizeCountStarRule.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
M testdata/workloads/functional-planner/queries/PlannerTest/distinct.test
M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test
A testdata/workloads/functional-planner/queries/PlannerTest/parquet-stats-agg.test
M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-query/queries/QueryTest/aggregation.test
R testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test
M tests/query_test/test_aggregation.py
M tests/query_test/test_parquet_stats.py
25 files changed, 915 insertions(+), 325 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/6812/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6812
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <marcel@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokhtar@cloudera.com>

Mime
View raw message