From: jcamacho@apache.org
To: commits@hive.apache.org
Date: Sun, 26 Jun 2016 02:17:56 -0000
Subject: [9/9] hive git commit: HIVE-13982: Extensions to RS dedup: execute with different column order and sorting direction if possible (Jesus Camacho Rodriguez, reviewed by Ashutosh Chauhan)

HIVE-13982: Extensions to RS dedup: execute with different column order and sorting direction if possible (Jesus Camacho Rodriguez, reviewed by Ashutosh Chauhan)

Project: http://git-wip-us.apache.org/repos/asf/hive/repo
Commit: http://git-wip-us.apache.org/repos/asf/hive/commit/4accba2f
Tree: http://git-wip-us.apache.org/repos/asf/hive/tree/4accba2f
Diff: http://git-wip-us.apache.org/repos/asf/hive/diff/4accba2f

Branch: refs/heads/branch-2.1
Commit: 4accba2f57d62c2e92acbeda62c068675ae73a08
Parents: 4f04ac5
Author: Jesus Camacho Rodriguez
Authored: Sat Jun 25 19:17:31 2016 -0700
Committer: Jesus Camacho Rodriguez
Committed: Sat Jun 25 19:17:31 2016 -0700

----------------------------------------------------------------------
 .../org/apache/hadoop/hive/conf/HiveConf.java      |    2 +
 .../ql/optimizer/LimitPushdownOptimizer.java       |   14 +-
 .../calcite/reloperators/HiveAggregate.java        |   12 +
 .../calcite/rules/HiveRelColumnsAlignment.java     |  262 +++
 .../calcite/translator/ASTConverter.java           |   19 +-
 .../translator/PlanModifierForASTConv.java         |   17 +-
 .../correlation/ReduceSinkDeDuplication.java       |   47 +-
 .../hadoop/hive/ql/parse/CalcitePlanner.java       |    3 +-
 .../clientpositive/correlationoptimizer13.q        |    1 -
 .../queries/clientpositive/limit_pushdown2.q       |   78 ++
 .../reduce_deduplicate_extended2.q                 |   92 ++
 .../clientpositive/annotate_stats_join.q.out       |   62 +-
 .../results/clientpositive/bucket_groupby.q.out    |    9 +-
 .../clientpositive/correlationoptimizer13.q.out    |  194 ++--
 .../clientpositive/dynamic_rdd_cache.q.out         |   28 +-
 .../clientpositive/filter_cond_pushdown.q.out      |   12 +-
 .../clientpositive/limit_pushdown2.q.out           |  804 +++++++++++++
 .../clientpositive/limit_pushdown3.q.out           |    2 +
 .../results/clientpositive/merge_join_1.q.out      |    6 +-
 .../results/clientpositive/perf/query17.q.out      |    6 +-
 .../results/clientpositive/perf/query19.q.out      |  180 +-
 .../results/clientpositive/perf/query20.q.out      |    4 +-
 .../results/clientpositive/perf/query25.q.out      |    6 +-
 .../results/clientpositive/perf/query29.q.out      |    6 +-
 .../results/clientpositive/perf/query3.q.out       |   88 +-
 .../results/clientpositive/perf/query39.q.out      |   14 +-
 .../results/clientpositive/perf/query40.q.out      |    6 +-
 .../results/clientpositive/perf/query45.q.out      |   18 +-
 .../results/clientpositive/perf/query46.q.out      |    4 +-
 .../results/clientpositive/perf/query50.q.out      |    6 +-
 .../results/clientpositive/perf/query54.q.out      |  128 ++-
 .../results/clientpositive/perf/query64.q.out      |   14 +-
 .../results/clientpositive/perf/query65.q.out      |   64 +-
 .../results/clientpositive/perf/query68.q.out      |    4 +-
 .../results/clientpositive/perf/query71.q.out      |    4 +-
 .../results/clientpositive/perf/query75.q.out      |   36 +-
 .../results/clientpositive/perf/query85.q.out      |    6 +-
 .../results/clientpositive/perf/query87.q.out      |  264 +--
 .../results/clientpositive/perf/query92.q.out      |   18 +-
 .../results/clientpositive/perf/query97.q.out      |   18 +-
 .../results/clientpositive/perf/query98.q.out      |    4 +-
 .../results/clientpositive/ptfgroupbyjoin.q.out    |   72 +-
 .../reduce_deduplicate_extended2.q.out             | 1079 ++++++++++++++++++
 .../test/results/clientpositive/regex_col.q.out    |   12 +-
 .../test/results/clientpositive/semijoin4.q.out    |   12 +-
 .../test/results/clientpositive/semijoin5.q.out    |   12 +-
 .../spark/annotate_stats_join.q.out                |   62 +-
 .../spark/dynamic_rdd_cache.q.out                  |   28 +-
 .../clientpositive/spark/subquery_exists.q.out     |    8 +-
 .../clientpositive/spark/subquery_in.q.out         |   26 +-
 .../spark/vector_outer_join3.q.out                 |   16 +-
 .../clientpositive/spark/vectorization_13.q.out    |   28 +-
 .../clientpositive/spark/vectorization_14.q.out    |   10 +-
 .../clientpositive/spark/vectorization_15.q.out    |   14 +-
 .../spark/vectorization_short_regress.q.out        |   15 +-
 .../clientpositive/subquery_exists.q.out           |    8 +-
 .../results/clientpositive/subquery_in.q.out       |   34 +-
 .../clientpositive/subquery_in_having.q.out        |   82 +-
 .../clientpositive/subquery_notexists.q.out        |   34 +-
 .../subquery_notexists_having.q.out                |   56 +-
 .../results/clientpositive/subquery_notin.q.out    |   12 +-
 .../subquery_unqualcolumnrefs.q.out                |  150 +--
 .../results/clientpositive/subquery_views.q.out    |   12 +-
 .../clientpositive/tez/explainuser_1.q.out         |  458 ++++----
 .../clientpositive/tez/explainuser_2.q.out         |  900 ++++++---
 .../clientpositive/tez/subquery_exists.q.out       |    8 +-
 .../clientpositive/tez/subquery_in.q.out           |   26 +-
 .../tez/vector_groupby_reduce.q.out                |   66 +-
 .../tez/vector_interval_mapjoin.q.out              |    8 +-
 .../clientpositive/tez/vector_outer_join3.q.out    |   16 +-
 .../clientpositive/tez/vectorization_13.q.out      |   28 +-
 .../clientpositive/tez/vectorization_14.q.out      |   10 +-
 .../clientpositive/tez/vectorization_15.q.out      |   14 +-
 .../tez/vectorization_short_regress.q.out          |   15 +-
 .../clientpositive/vector_groupby_reduce.q.out     |   79 +-
 .../vector_interval_mapjoin.q.out                  |    8 +-
 .../clientpositive/vector_outer_join3.q.out        |   16 +-
 .../clientpositive/vectorization_13.q.out          |   28 +-
 .../clientpositive/vectorization_14.q.out          |   10 +-
 .../clientpositive/vectorization_15.q.out          |   14 +-
 .../vectorization_short_regress.q.out              |   15 +-
 81 files changed, 4212 insertions(+), 1851 deletions(-)
----------------------------------------------------------------------
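The change hinges on one recurring trick, applied in both LimitPushdownOptimizer and ReduceSinkDeDuplication below: when the child ReduceSink orders a prefix of the parent ReduceSink's keys, keep the child's sort directions for that prefix and extend them with the parent's directions for the remaining keys. A minimal standalone sketch of that merge follows, assuming Hive's usual encoding of one '+' (ascending) or '-' (descending) per key column; this is illustrative plain Java, not the Hive classes themselves.

    // Sketch of the order-string merge used by this patch (hypothetical class).
    public class OrderMergeSketch {
      // corder/porder encode one '+' (asc) or '-' (desc) per key column.
      static String merge(String corder, String porder) {
        // Keep the child's directions for the shared prefix...
        StringBuilder order = new StringBuilder(corder);
        // ...and fill in the remaining positions from the parent's order.
        order.append(porder.substring(order.length()));
        return order.toString();
      }

      public static void main(String[] args) {
        // Child sorts two keys (asc, desc); parent sorts three keys (all asc).
        System.out.println(merge("+-", "+++")); // prints "+-+"
      }
    }

The merged string lets one ReduceSink serve both operators without an extra shuffle stage, while still honoring whatever ordering the parent enforced beyond the shared prefix.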

http://git-wip-us.apache.org/repos/asf/hive/blob/4accba2f/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
----------------------------------------------------------------------
diff --git a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
index eea65e4..2ebab4b 100644
--- a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
+++ b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
@@ -1008,6 +1008,8 @@ public class HiveConf extends Configuration {
     HIVE_CBO_COST_MODEL_HDFS_READ("hive.cbo.costmodel.hdfs.read", "1.5", "Default cost of reading a byte from HDFS;"
         + " expressed as multiple of Local FS read cost"),
     AGGR_JOIN_TRANSPOSE("hive.transpose.aggr.join", false, "push aggregates through join"),
+    HIVE_COLUMN_ALIGNMENT("hive.order.columnalignment", true, "Flag to control whether we want to try to align" +
+        "columns in operators such as Aggregate or Join so that we try to reduce the number of shuffling stages"),

     // hive.mapjoin.bucket.cache.size has been replaced by hive.smbjoin.cache.row,
     // need to remove by hive .13. Also, do not change default (see SMB operator)

http://git-wip-us.apache.org/repos/asf/hive/blob/4accba2f/ql/src/java/org/apache/hadoop/hive/ql/optimizer/LimitPushdownOptimizer.java
----------------------------------------------------------------------
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/LimitPushdownOptimizer.java b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/LimitPushdownOptimizer.java
index 3e72ba5..f68d0ad 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/LimitPushdownOptimizer.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/LimitPushdownOptimizer.java
@@ -184,10 +184,12 @@ public class LimitPushdownOptimizer extends Transform {
         return false;
       }
       // Copy order
-      pRS.getConf().setOrder(cRS.getConf().getOrder().substring(
-          0, pRS.getConf().getOrder().length()));
-      pRS.getConf().setNullOrder(cRS.getConf().getNullOrder().substring(
-          0, pRS.getConf().getNullOrder().length()));
+      StringBuilder order = new StringBuilder(cRS.getConf().getOrder());
+      StringBuilder orderNull = new StringBuilder(cRS.getConf().getNullOrder());
+      order.append(pRS.getConf().getOrder().substring(order.length()));
+      orderNull.append(pRS.getConf().getNullOrder().substring(orderNull.length()));
+      pRS.getConf().setOrder(order.toString());
+      pRS.getConf().setNullOrder(orderNull.toString());
       // Copy limit
       pRS.getConf().setTopN(cRS.getConf().getTopN());
       pRS.getConf().setTopNMemoryUsage(cRS.getConf().getTopNMemoryUsage());
@@ -210,10 +212,10 @@ public class LimitPushdownOptimizer extends Transform {
       if (pKeys == null || pKeys.isEmpty()) {
         return false;
       }
-      if (cKeys.size() < pKeys.size()) {
+      if (cKeys.size() > pKeys.size()) {
         return false;
       }
-      for (int i = 0; i < pKeys.size(); i++) {
+      for (int i = 0; i < cKeys.size(); i++) {
         ExprNodeDesc expr = ExprNodeDescUtils.backtrack(cKeys.get(i), cRS, pRS);
         if (expr == null) {
           // cKey is not present in parent
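The second hunk above flips the direction of the key-count test: pushdown is now allowed when the child's keys form a prefix of the parent's keys, so the loop iterates over cKeys instead of pKeys. A standalone sketch of that prefix rule, with plain strings standing in for the backtracked ExprNodeDesc keys (class and method names here are hypothetical):

    import java.util.Arrays;
    import java.util.List;

    // Sketch of the corrected prefix check: the child key list must be no
    // longer than the parent's, and every child key must match the parent
    // key at the same position.
    public class PrefixKeyCheckSketch {
      static boolean isPrefix(List<String> cKeys, List<String> pKeys) {
        if (cKeys.size() > pKeys.size()) {
          return false; // the fixed direction of the size test
        }
        for (int i = 0; i < cKeys.size(); i++) { // iterate over child keys only
          if (!cKeys.get(i).equals(pKeys.get(i))) {
            return false;
          }
        }
        return true;
      }

      public static void main(String[] args) {
        System.out.println(isPrefix(Arrays.asList("key"), Arrays.asList("key", "value")));   // true
        System.out.println(isPrefix(Arrays.asList("key", "value"), Arrays.asList("value"))); // false
      }
    }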

http://git-wip-us.apache.org/repos/asf/hive/blob/4accba2f/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveAggregate.java
----------------------------------------------------------------------
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveAggregate.java b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveAggregate.java
index 9cb62c8..dc6b152 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveAggregate.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveAggregate.java
@@ -17,6 +17,7 @@
  */
 package org.apache.hadoop.hive.ql.optimizer.calcite.reloperators;

+import java.util.LinkedHashSet;
 import java.util.List;
 import java.util.Set;

@@ -41,6 +42,9 @@ import com.google.common.collect.Sets;

 public class HiveAggregate extends Aggregate implements HiveRelNode {

+  private LinkedHashSet<Integer> aggregateColumnsOrder;
+
+
   public HiveAggregate(RelOptCluster cluster, RelTraitSet traitSet,
           RelNode child, boolean indicator, ImmutableBitSet groupSet,
           List<ImmutableBitSet> groupSets, List<AggregateCall> aggCalls) {
@@ -126,4 +130,12 @@ public class HiveAggregate extends Aggregate implements HiveRelNode {
     return builder.build();
   }

+  public void setAggregateColumnsOrder(LinkedHashSet<Integer> aggregateColumnsOrder) {
+    this.aggregateColumnsOrder = aggregateColumnsOrder;
+  }
+
+  public LinkedHashSet<Integer> getAggregateColumnsOrder() {
+    return this.aggregateColumnsOrder;
+  }
+
 }

http://git-wip-us.apache.org/repos/asf/hive/blob/4accba2f/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelColumnsAlignment.java
----------------------------------------------------------------------
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelColumnsAlignment.java b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelColumnsAlignment.java
new file mode 100644
index 0000000..f35bf2f
--- /dev/null
+++ b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelColumnsAlignment.java
@@ -0,0 +1,262 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.optimizer.calcite.rules;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.LinkedHashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Map.Entry;
+import java.util.Set;
+
+import org.apache.calcite.plan.RelOptUtil;
+import org.apache.calcite.rel.RelFieldCollation;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.rel.core.Aggregate;
+import org.apache.calcite.rel.core.Filter;
+import org.apache.calcite.rel.core.Join;
+import org.apache.calcite.rel.core.Project;
+import org.apache.calcite.rel.core.SetOp;
+import org.apache.calcite.rel.core.Sort;
+import org.apache.calcite.rex.RexCall;
+import org.apache.calcite.rex.RexInputRef;
+import org.apache.calcite.rex.RexNode;
+import org.apache.calcite.rex.RexOver;
+import org.apache.calcite.rex.RexUtil;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.tools.RelBuilder;
+import org.apache.calcite.util.ReflectUtil;
+import org.apache.calcite.util.ReflectiveVisitor;
+import org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveAggregate;
+
+import com.google.common.collect.ImmutableList;
+
+
+/**
+ * This class infers the order in Aggregate columns and the order of conjuncts
+ * in a Join condition that might be more beneficial to avoid additional sort
+ * stages. The only visible change is that the order of join conditions might
+ * change. Further, Aggregate operators might get annotated with the order in
+ * which Aggregate columns should be generated when we transform the operator
+ * tree into AST or Hive operator tree.
+ */
+public class HiveRelColumnsAlignment implements ReflectiveVisitor {
+
+  private final ReflectUtil.MethodDispatcher<RelNode> alignDispatcher;
+  private final RelBuilder relBuilder;
+
+
+  /**
+   * Creates a HiveRelColumnsAlignment.
+   */
+  public HiveRelColumnsAlignment(RelBuilder relBuilder) {
+    this.relBuilder = relBuilder;
+    this.alignDispatcher =
+        ReflectUtil.createMethodDispatcher(
+            RelNode.class,
+            this,
+            "align",
+            RelNode.class,
+            List.class);
+  }
+
+
+  /**
+   * Execute the logic in this class. In particular, make a top-down traversal of the tree
+   * and annotate and recreate appropriate operators.
+   */
+  public RelNode align(RelNode root) {
+    final RelNode newRoot = dispatchAlign(root, ImmutableList.of());
+    return newRoot;
+  }
+
+  protected final RelNode dispatchAlign(RelNode node, List<RelFieldCollation> collations) {
+    return alignDispatcher.invoke(node, collations);
+  }
+
+  public RelNode align(Aggregate rel, List<RelFieldCollation> collations) {
+    // 1) We extract the group by positions that are part of the collations and
+    // sort them so they respect it
+    LinkedHashSet<Integer> aggregateColumnsOrder = new LinkedHashSet<>();
+    ImmutableList.Builder<RelFieldCollation> propagateCollations = ImmutableList.builder();
+    if (!rel.indicator && !collations.isEmpty()) {
+      for (RelFieldCollation c : collations) {
+        if (c.getFieldIndex() < rel.getGroupCount()) {
+          // Group column found
+          if (aggregateColumnsOrder.add(c.getFieldIndex())) {
+            propagateCollations.add(c.copy(rel.getGroupSet().nth(c.getFieldIndex())));
+          }
+        }
+      }
+    }
+    for (int i = 0; i < rel.getGroupCount(); i++) {
+      if (!aggregateColumnsOrder.contains(i)) {
+        // Not included in the input collations, but can be propagated as this Aggregate
+        // will enforce it
+        propagateCollations.add(new RelFieldCollation(rel.getGroupSet().nth(i)));
+      }
+    }
+
+    // 2) We propagate
+    final RelNode child = dispatchAlign(rel.getInput(), propagateCollations.build());
+
+    // 3) We annotate the Aggregate operator with this info
+    final HiveAggregate newAggregate = (HiveAggregate) rel.copy(rel.getTraitSet(),
+        ImmutableList.of(child));
+    newAggregate.setAggregateColumnsOrder(aggregateColumnsOrder);
+    return newAggregate;
+  }
+
+  public RelNode align(Join rel, List<RelFieldCollation> collations) {
+    ImmutableList.Builder<RelFieldCollation> propagateCollationsLeft = ImmutableList.builder();
+    ImmutableList.Builder<RelFieldCollation> propagateCollationsRight = ImmutableList.builder();
+    final int nLeftColumns = rel.getLeft().getRowType().getFieldList().size();
+    Map<Integer, RexNode> idxToConjuncts = new HashMap<>();
+    Map<Integer, Integer> refToRef = new HashMap<>();
+    // 1) We extract the conditions that can be useful
+    List<RexNode> conjuncts = new ArrayList<>();
+    List<RexNode> otherConjuncts = new ArrayList<>();
+    for (RexNode conj : RelOptUtil.conjunctions(rel.getCondition())) {
+      if (conj.getKind() != SqlKind.EQUALS) {
+        otherConjuncts.add(conj);
+        continue;
+      }
+      // TODO: Currently we only support EQUAL operator on two references.
+      // We might extend the logic to support other (order-preserving)
+      // UDFs here.
+      RexCall equals = (RexCall) conj;
+      if (!(equals.getOperands().get(0) instanceof RexInputRef) ||
+          !(equals.getOperands().get(1) instanceof RexInputRef)) {
+        otherConjuncts.add(conj);
+        continue;
+      }
+      RexInputRef ref0 = (RexInputRef) equals.getOperands().get(0);
+      RexInputRef ref1 = (RexInputRef) equals.getOperands().get(1);
+      if ((ref0.getIndex() < nLeftColumns && ref1.getIndex() >= nLeftColumns) ||
+          (ref1.getIndex() < nLeftColumns && ref0.getIndex() >= nLeftColumns)) {
+        // We made sure the references are for different join inputs
+        idxToConjuncts.put(ref0.getIndex(), equals);
+        idxToConjuncts.put(ref1.getIndex(), equals);
+        refToRef.put(ref0.getIndex(), ref1.getIndex());
+        refToRef.put(ref1.getIndex(), ref0.getIndex());
+      } else {
+        otherConjuncts.add(conj);
+      }
+    }
+
+    // 2) We extract the collation for this operator and the collations
+    // that we will propagate to the inputs of the join
+    for (RelFieldCollation c : collations) {
+      RexNode equals = idxToConjuncts.get(c.getFieldIndex());
+      if (equals != null) {
+        conjuncts.add(equals);
+        idxToConjuncts.remove(c.getFieldIndex());
+        idxToConjuncts.remove(refToRef.get(c.getFieldIndex()));
+        if (c.getFieldIndex() < nLeftColumns) {
+          propagateCollationsLeft.add(c.copy(c.getFieldIndex()));
+          propagateCollationsRight.add(c.copy(refToRef.get(c.getFieldIndex()) - nLeftColumns));
+        } else {
+          propagateCollationsLeft.add(c.copy(refToRef.get(c.getFieldIndex())));
+          propagateCollationsRight.add(c.copy(c.getFieldIndex() - nLeftColumns));
+        }
+      }
+    }
+    final Set<RexNode> visited = new HashSet<>();
+    for (Entry<Integer, RexNode> e : idxToConjuncts.entrySet()) {
+      if (visited.add(e.getValue())) {
+        // Not included in the input collations, but can be propagated as this Join
+        // might enforce it
+        conjuncts.add(e.getValue());
+        if (e.getKey() < nLeftColumns) {
+          propagateCollationsLeft.add(new RelFieldCollation(e.getKey()));
+          propagateCollationsRight.add(new RelFieldCollation(refToRef.get(e.getKey()) - nLeftColumns));
+        } else {
+          propagateCollationsLeft.add(new RelFieldCollation(refToRef.get(e.getKey())));
+          propagateCollationsRight.add(new RelFieldCollation(e.getKey() - nLeftColumns));
+        }
+      }
+    }
+    conjuncts.addAll(otherConjuncts);
+
+    // 3) We propagate
+    final RelNode newLeftInput = dispatchAlign(rel.getLeft(), propagateCollationsLeft.build());
+    final RelNode newRightInput = dispatchAlign(rel.getRight(), propagateCollationsRight.build());
+
+    // 4) We change the Join operator to reflect this info
+    final RelNode newJoin = rel.copy(rel.getTraitSet(), RexUtil.composeConjunction(
+        relBuilder.getRexBuilder(), conjuncts, false), newLeftInput, newRightInput,
+        rel.getJoinType(), rel.isSemiJoinDone());
+    return newJoin;
+  }
+
+  public RelNode align(SetOp rel, List<RelFieldCollation> collations) {
+    ImmutableList.Builder<RelNode> newInputs = new ImmutableList.Builder<>();
+    for (RelNode input : rel.getInputs()) {
+      newInputs.add(dispatchAlign(input, collations));
+    }
+    return rel.copy(rel.getTraitSet(), newInputs.build());
+  }
+
+  public RelNode align(Project rel, List<RelFieldCollation> collations) {
+    // 1) We extract the collations indices
+    boolean containsWindowing = false;
+    for (RexNode childExp : rel.getChildExps()) {
+      if (childExp instanceof RexOver) {
+        // TODO: support propagation for partitioning/ordering in windowing
+        containsWindowing = true;
+        break;
+      }
+    }
+    ImmutableList.Builder<RelFieldCollation> propagateCollations = ImmutableList.builder();
+    if (!containsWindowing) {
+      for (RelFieldCollation c : collations) {
+        RexNode rexNode = rel.getChildExps().get(c.getFieldIndex());
+        if (rexNode instanceof RexInputRef) {
+          int newIdx = ((RexInputRef) rexNode).getIndex();
+          propagateCollations.add(c.copy((newIdx)));
+        }
+      }
+    }
+    // 2) We propagate
+    final RelNode child = dispatchAlign(rel.getInput(), propagateCollations.build());
+    // 3) Return new Project
+    return rel.copy(rel.getTraitSet(), ImmutableList.of(child));
+  }
+
+  public RelNode align(Filter rel, List<RelFieldCollation> collations) {
+    final RelNode child = dispatchAlign(rel.getInput(), collations);
+    return rel.copy(rel.getTraitSet(), ImmutableList.of(child));
+  }
+
+  public RelNode align(Sort rel, List<RelFieldCollation> collations) {
+    final RelNode child = dispatchAlign(rel.getInput(), rel.collation.getFieldCollations());
+    return rel.copy(rel.getTraitSet(), ImmutableList.of(child));
+  }
+
+  // Catch-all rule when none of the others apply.
+  public RelNode align(RelNode rel, List<RelFieldCollation> collations) {
+    ImmutableList.Builder<RelNode> newInputs = new ImmutableList.Builder<>();
+    for (RelNode input : rel.getInputs()) {
+      newInputs.add(dispatchAlign(input, ImmutableList.of()));
+    }
+    return rel.copy(rel.getTraitSet(), newInputs.build());
+  }
+
+}
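HiveRelColumnsAlignment picks the most specific align(...) overload at runtime through Calcite's ReflectUtil.createMethodDispatcher rather than a hand-written chain of instanceof checks. A toy sketch of that reflective-visitor dispatch in plain Java, using a hypothetical Shape hierarchy in place of RelNode and no Calcite dependency:

    import java.lang.reflect.Method;

    // Toy reflective visitor: find the visit(...) overload matching the
    // runtime type of the argument, walking up the class hierarchy until
    // an overload exists (the base-class overload acts as the catch-all).
    public class ReflectiveDispatchSketch {
      static class Shape {}
      static class Circle extends Shape {}

      public String visit(Shape s) { return "generic shape"; }
      public String visit(Circle c) { return "circle"; }

      public String dispatch(Shape s) throws Exception {
        Class<?> clazz = s.getClass();
        while (clazz != null) {
          try {
            Method m = getClass().getMethod("visit", clazz);
            return (String) m.invoke(this, s);
          } catch (NoSuchMethodException e) {
            clazz = clazz.getSuperclass(); // fall back to a more general overload
          }
        }
        throw new IllegalStateException("no visit overload found");
      }

      public static void main(String[] args) throws Exception {
        ReflectiveDispatchSketch v = new ReflectiveDispatchSketch();
        System.out.println(v.dispatch(new Circle()));   // circle
        System.out.println(v.dispatch(new Shape() {})); // generic shape
      }
    }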

http://git-wip-us.apache.org/repos/asf/hive/blob/4accba2f/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/ASTConverter.java
----------------------------------------------------------------------
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/ASTConverter.java b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/ASTConverter.java
index 353d8db..40215a2 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/ASTConverter.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/ASTConverter.java
@@ -56,6 +56,7 @@ import org.apache.calcite.util.ImmutableBitSet;
 import org.apache.hadoop.hive.metastore.api.FieldSchema;
 import org.apache.hadoop.hive.ql.metadata.VirtualColumn;
 import org.apache.hadoop.hive.ql.optimizer.calcite.CalciteSemanticException;
+import org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveAggregate;
 import org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveGroupingID;
 import org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveSortLimit;
 import org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveTableScan;
@@ -90,9 +91,9 @@ public class ASTConverter {
     this.derivedTableCount = dtCounterInitVal;
   }

-  public static ASTNode convert(final RelNode relNode, List<FieldSchema> resultSchema)
+  public static ASTNode convert(final RelNode relNode, List<FieldSchema> resultSchema, boolean alignColumns)
       throws CalciteSemanticException {
-    RelNode root = PlanModifierForASTConv.convertOpTree(relNode, resultSchema);
+    RelNode root = PlanModifierForASTConv.convertOpTree(relNode, resultSchema, alignColumns);
     ASTConverter c = new ASTConverter(root, 0);
     return c.convert();
   }
@@ -142,11 +143,19 @@ public class ASTConverter {
         b = ASTBuilder.construct(HiveParser.TOK_GROUPBY, "TOK_GROUPBY");
       }

-      for (int i : groupBy.getGroupSet()) {
-        RexInputRef iRef = new RexInputRef(i, groupBy.getCluster().getTypeFactory()
-            .createSqlType(SqlTypeName.ANY));
+      HiveAggregate hiveAgg = (HiveAggregate) groupBy;
+      for (int pos : hiveAgg.getAggregateColumnsOrder()) {
+        RexInputRef iRef = new RexInputRef(groupBy.getGroupSet().nth(pos),
+            groupBy.getCluster().getTypeFactory().createSqlType(SqlTypeName.ANY));
         b.add(iRef.accept(new RexVisitor(schema)));
       }
+      for (int pos = 0; pos < groupBy.getGroupCount(); pos++) {
+        if (!hiveAgg.getAggregateColumnsOrder().contains(pos)) {
+          RexInputRef iRef = new RexInputRef(groupBy.getGroupSet().nth(pos),
+              groupBy.getCluster().getTypeFactory().createSqlType(SqlTypeName.ANY));
+          b.add(iRef.accept(new RexVisitor(schema)));
+        }
+      }

       //Grouping sets expressions
       if(groupingSetsExpression) {

http://git-wip-us.apache.org/repos/asf/hive/blob/4accba2f/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/PlanModifierForASTConv.java
----------------------------------------------------------------------
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/PlanModifierForASTConv.java b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/PlanModifierForASTConv.java
index 1a543fb..9db7727 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/PlanModifierForASTConv.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/PlanModifierForASTConv.java
@@ -40,16 +40,18 @@ import org.apache.calcite.rex.RexNode;
 import org.apache.calcite.rex.RexOver;
 import org.apache.calcite.sql.SqlAggFunction;
 import org.apache.calcite.util.Pair;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
 import org.apache.hadoop.hive.metastore.api.FieldSchema;
 import org.apache.hadoop.hive.ql.optimizer.calcite.CalciteSemanticException;
 import org.apache.hadoop.hive.ql.optimizer.calcite.HiveCalciteUtil;
+import org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelFactories;
 import org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveAggregate;
 import org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveProject;
 import org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveSortLimit;
 import org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveTableScan;
+import org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelColumnsAlignment;
 import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;

 import com.google.common.collect.ImmutableList;

@@ -58,7 +60,7 @@ public class PlanModifierForASTConv {

   private static final Logger LOG = LoggerFactory.getLogger(PlanModifierForASTConv.class);

-  public static RelNode convertOpTree(RelNode rel, List<FieldSchema> resultSchema)
+  public static RelNode convertOpTree(RelNode rel, List<FieldSchema> resultSchema, boolean alignColumns)
       throws CalciteSemanticException {
     RelNode newTopNode = rel;
     if (LOG.isDebugEnabled()) {
@@ -78,6 +80,15 @@ public class PlanModifierForASTConv {
       LOG.debug("Plan after nested convertOpTree\n " + RelOptUtil.toString(newTopNode));
     }

+    if (alignColumns) {
+      HiveRelColumnsAlignment propagator = new HiveRelColumnsAlignment(
+          HiveRelFactories.HIVE_BUILDER.create(newTopNode.getCluster(), null));
+      newTopNode = propagator.align(newTopNode);
+      if (LOG.isDebugEnabled()) {
+        LOG.debug("Plan after propagating order\n " + RelOptUtil.toString(newTopNode));
+      }
+    }
+
     Pair<RelNode, RelNode> topSelparentPair = HiveCalciteUtil.getTopLevelSelect(newTopNode);
     PlanModifierUtil.fixTopOBSchema(newTopNode, topSelparentPair, resultSchema, true);
     if (LOG.isDebugEnabled()) {

http://git-wip-us.apache.org/repos/asf/hive/blob/4accba2f/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplication.java
----------------------------------------------------------------------
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplication.java b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplication.java
index 77771c3..d53efbf 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplication.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplication.java
@@ -34,6 +34,7 @@ import org.apache.hadoop.hive.metastore.api.FieldSchema;
 import org.apache.hadoop.hive.ql.exec.GroupByOperator;
 import org.apache.hadoop.hive.ql.exec.JoinOperator;
 import org.apache.hadoop.hive.ql.exec.Operator;
+import org.apache.hadoop.hive.ql.exec.PTFOperator;
 import org.apache.hadoop.hive.ql.exec.ReduceSinkOperator;
 import org.apache.hadoop.hive.ql.exec.SelectOperator;
 import org.apache.hadoop.hive.ql.lib.DefaultGraphWalker;
@@ -205,7 +206,7 @@ public class ReduceSinkDeDuplication extends Transform {
         return false;
       }

-      Integer moveRSOrderTo = checkOrder(cRSc.getOrder(), pRSNc.getOrder(),
+      Integer moveRSOrderTo = checkOrder(true, cRSc.getOrder(), pRSNc.getOrder(),
           cRSc.getNullOrder(), pRSNc.getNullOrder());
       if (moveRSOrderTo == null) {
         return false;
@@ -304,6 +305,16 @@ public class ReduceSinkDeDuplication extends Transform {
         }
         pRS.getConf().setOrder(cRS.getConf().getOrder());
         pRS.getConf().setNullOrder(cRS.getConf().getNullOrder());
+      } else {
+        // The sorting order of the parent RS is more specific or they are equal.
+        // We will copy the order from the child RS, and then fill in the order
+        // of the rest of columns with the one taken from parent RS.
+        StringBuilder order = new StringBuilder(cRS.getConf().getOrder());
+        StringBuilder orderNull = new StringBuilder(cRS.getConf().getNullOrder());
+        order.append(pRS.getConf().getOrder().substring(order.length()));
+        orderNull.append(pRS.getConf().getNullOrder().substring(orderNull.length()));
+        pRS.getConf().setOrder(order.toString());
+        pRS.getConf().setNullOrder(orderNull.toString());
       }

       if (result[3] > 0) {
@@ -342,7 +353,9 @@ public class ReduceSinkDeDuplication extends Transform {
         throws SemanticException {
       ReduceSinkDesc cConf = cRS.getConf();
       ReduceSinkDesc pConf = pRS.getConf();
-      Integer moveRSOrderTo = checkOrder(cConf.getOrder(), pConf.getOrder(),
+      // If there is a PTF between cRS and pRS we cannot ignore the order direction
+      final boolean checkStrictEquality = isStrictEqualityNeeded(cRS, pRS);
+      Integer moveRSOrderTo = checkOrder(checkStrictEquality, cConf.getOrder(), pConf.getOrder(),
           cConf.getNullOrder(), pConf.getNullOrder());
       if (moveRSOrderTo == null) {
         return null;
@@ -370,6 +383,18 @@ public class ReduceSinkDeDuplication extends Transform {
           moveReducerNumTo, moveNumDistKeyTo};
     }

+    private boolean isStrictEqualityNeeded(ReduceSinkOperator cRS, ReduceSinkOperator pRS) {
+      Operator<?> parent = cRS.getParentOperators().get(0);
+      while (parent != pRS) {
+        assert parent.getNumParent() == 1;
+        if (parent instanceof PTFOperator) {
+          return true;
+        }
+        parent = parent.getParentOperators().get(0);
+      }
+      return false;
+    }
+
     private Integer checkNumDistributionKey(int cnd, int pnd) {
       // number of distribution keys of cRS is chosen only when numDistKeys of pRS
      // is 0 or less. In all other cases, distribution of the keys is based on
@@ -452,8 +477,7 @@ public class ReduceSinkDeDuplication extends Transform {
       return Integer.valueOf(cexprs.size()).compareTo(pexprs.size());
     }

-    // order of overlapping keys should be exactly the same
-    protected Integer checkOrder(String corder, String porder,
+    protected Integer checkOrder(boolean checkStrictEquality, String corder, String porder,
         String cNullOrder, String pNullOrder) {
       assert corder.length() == cNullOrder.length();
       assert porder.length() == pNullOrder.length();
@@ -468,12 +492,15 @@
       }
       corder = corder.trim();
       porder = porder.trim();
-      cNullOrder = cNullOrder.trim();
-      pNullOrder = pNullOrder.trim();
-      int target = Math.min(corder.length(), porder.length());
-      if (!corder.substring(0, target).equals(porder.substring(0, target)) ||
-          !cNullOrder.substring(0, target).equals(pNullOrder.substring(0, target))) {
-        return null;
+      if (checkStrictEquality) {
+        // order of overlapping keys should be exactly the same
+        cNullOrder = cNullOrder.trim();
+        pNullOrder = pNullOrder.trim();
+        int target = Math.min(corder.length(), porder.length());
+        if (!corder.substring(0, target).equals(porder.substring(0, target)) ||
+            !cNullOrder.substring(0, target).equals(pNullOrder.substring(0, target))) {
+          return null;
+        }
       }
       return Integer.valueOf(corder.length()).compareTo(porder.length());
     }

http://git-wip-us.apache.org/repos/asf/hive/blob/4accba2f/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java
----------------------------------------------------------------------
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java b/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java
index 0d4c1bb..fb43e7d 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java
@@ -715,7 +715,8 @@ public class CalcitePlanner extends SemanticAnalyzer {
         rethrowCalciteException(e);
         throw new AssertionError("rethrowCalciteException didn't throw for " + e.getMessage());
       }
-      optiqOptimizedAST = ASTConverter.convert(optimizedOptiqPlan, resultSchema);
+      optiqOptimizedAST = ASTConverter.convert(optimizedOptiqPlan, resultSchema,
+          HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_COLUMN_ALIGNMENT));

       return optiqOptimizedAST;
     }
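With the new checkStrictEquality flag, checkOrder only insists on identical sort directions over the overlapping key prefix when a PTF sits between the two ReduceSinks; otherwise differing directions no longer block deduplication. A simplified standalone sketch of that comparison, leaving out the null-order strings and blank-key trimming of the real method (the class name is hypothetical):

    // Sketch of the relaxed order check. Returns which side's key list is
    // longer (-1: parent, 0: equal, 1: child), or null when strict mode
    // rejects mismatched directions on the overlapping prefix.
    public class CheckOrderSketch {
      static Integer checkOrder(boolean strict, String corder, String porder) {
        if (strict) {
          int overlap = Math.min(corder.length(), porder.length());
          if (!corder.substring(0, overlap).equals(porder.substring(0, overlap))) {
            return null; // a PTF in between needs the exact sort directions
          }
        }
        return Integer.valueOf(corder.length()).compareTo(porder.length());
      }

      public static void main(String[] args) {
        System.out.println(checkOrder(true, "+-", "++"));  // null: strict mode rejects
        System.out.println(checkOrder(false, "+-", "++")); // 0: relaxed mode accepts
      }
    }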

http://git-wip-us.apache.org/repos/asf/hive/blob/4accba2f/ql/src/test/queries/clientpositive/correlationoptimizer13.q
----------------------------------------------------------------------
diff --git a/ql/src/test/queries/clientpositive/correlationoptimizer13.q b/ql/src/test/queries/clientpositive/correlationoptimizer13.q
index f1faf2a..495ff58 100644
--- a/ql/src/test/queries/clientpositive/correlationoptimizer13.q
+++ b/ql/src/test/queries/clientpositive/correlationoptimizer13.q
@@ -7,7 +7,6 @@ set hive.optimize.correlation=true;
 -- The query in this file have operators with same set of keys
 -- but having different sorting orders.
 -- Correlation optimizer currently do not optimize this case.
--- This case will be optimized latter (need a follow-up jira).
 EXPLAIN
 SELECT xx.key1, xx.key2, yy.key1, yy.key2, xx.cnt, yy.cnt

http://git-wip-us.apache.org/repos/asf/hive/blob/4accba2f/ql/src/test/queries/clientpositive/limit_pushdown2.q
----------------------------------------------------------------------
diff --git a/ql/src/test/queries/clientpositive/limit_pushdown2.q b/ql/src/test/queries/clientpositive/limit_pushdown2.q
new file mode 100644
index 0000000..637b5b0
--- /dev/null
+++ b/ql/src/test/queries/clientpositive/limit_pushdown2.q
@@ -0,0 +1,78 @@
+set hive.mapred.mode=nonstrict;
+set hive.explain.user=false;
+set hive.limit.pushdown.memory.usage=0.3f;
+set hive.optimize.reducededuplication.min.reducer=1;
+
+explain
+select key, value, avg(key + 1) from src
+group by key, value
+order by key, value limit 20;
+
+select key, value, avg(key + 1) from src
+group by key, value
+order by key, value limit 20;
+
+explain
+select key, value, avg(key + 1) from src
+group by key, value
+order by key, value desc limit 20;
+
+select key, value, avg(key + 1) from src
+group by key, value
+order by key, value desc limit 20;
+
+explain
+select key, value, avg(key + 1) from src
+group by key, value
+order by key desc, value limit 20;
+
+select key, value, avg(key + 1) from src
+group by key, value
+order by key desc, value limit 20;
+
+explain
+select key, value, avg(key + 1) from src
+group by value, key
+order by key, value limit 20;
+
+select key, value, avg(key + 1) from src
+group by value, key
+order by key, value limit 20;
+
+explain
+select key, value, avg(key + 1) from src
+group by value, key
+order by key desc, value limit 20;
+
+select key, value, avg(key + 1) from src
+group by value, key
+order by key desc, value limit 20;
+
+explain
+select key, value, avg(key + 1) from src
+group by value, key
+order by key desc limit 20;
+
+select key, value, avg(key + 1) from src
+group by value, key
+order by key desc limit 20;
+
+-- NOT APPLICABLE
+explain
+select value, avg(key + 1) myavg from src
+group by value
+order by myavg, value desc limit 20;
+
+select value, avg(key + 1) myavg from src
+group by value
+order by myavg, value desc limit 20;
+
+-- NOT APPLICABLE
+explain
+select key, value, avg(key + 1) from src
+group by value, key with rollup
+order by key, value limit 20;
+
+select key, value, avg(key + 1) from src
+group by value, key with rollup
+order by key, value limit 20;

http://git-wip-us.apache.org/repos/asf/hive/blob/4accba2f/ql/src/test/queries/clientpositive/reduce_deduplicate_extended2.q
----------------------------------------------------------------------
diff --git a/ql/src/test/queries/clientpositive/reduce_deduplicate_extended2.q b/ql/src/test/queries/clientpositive/reduce_deduplicate_extended2.q
new file mode 100644
index 0000000..cd67f4c
--- /dev/null
+++ b/ql/src/test/queries/clientpositive/reduce_deduplicate_extended2.q
@@ -0,0 +1,92 @@
+set hive.mapred.mode=nonstrict;
+set hive.explain.user=false;
+set hive.auto.convert.join=false;
+set hive.auto.convert.join.noconditionaltask=false;
+set hive.convert.join.bucket.mapjoin.tez=false;
+set hive.optimize.dynamic.partition.hashjoin=false;
+set hive.limit.pushdown.memory.usage=0.3f;
+set hive.optimize.reducededuplication.min.reducer=1;
+
+-- JOIN + GBY
+EXPLAIN
+SELECT f.key, g.value
+FROM src f
+JOIN src g ON (f.key = g.key AND f.value = g.value)
+GROUP BY g.value, f.key;
+
+-- JOIN + GBY + OBY
+EXPLAIN
+SELECT g.key, f.value
+FROM src f
+JOIN src g ON (f.key = g.key AND f.value = g.value)
+GROUP BY g.key, f.value
+ORDER BY f.value, g.key;
+
+-- GBY + JOIN + GBY
+EXPLAIN
+SELECT f.key, g.value
+FROM src f
+JOIN (
+  SELECT key, value
+  FROM src
+  GROUP BY key, value) g
+ON (f.key = g.key AND f.value = g.value)
+GROUP BY g.value, f.key;
+
+-- 2GBY + JOIN + GBY
+EXPLAIN
+SELECT f.key, g.value
+FROM (
+  SELECT key, value
+  FROM src
+  GROUP BY value, key) f
+JOIN (
+  SELECT key, value
+  FROM src
+  GROUP BY key, value) g
+ON (f.key = g.key AND f.value = g.value)
+GROUP BY g.value, f.key;
+
+-- 2GBY + JOIN + GBY + OBY
+EXPLAIN
+SELECT f.key, g.value
+FROM (
+  SELECT value
+  FROM src
+  GROUP BY value) g
+JOIN (
+  SELECT key
+  FROM src
+  GROUP BY key) f
+GROUP BY g.value, f.key
+ORDER BY f.key desc, g.value;
+
+-- 2(2GBY + JOIN + GBY + OBY) + UNION
+EXPLAIN
+SELECT x.key, x.value
+FROM (
+  SELECT f.key, g.value
+  FROM (
+    SELECT key, value
+    FROM src
+    GROUP BY key, value) f
+  JOIN (
+    SELECT key, value
+    FROM src
+    GROUP BY value, key) g
+  ON (f.key = g.key AND f.value = g.value)
+  GROUP BY g.value, f.key
+UNION ALL
+  SELECT f.key, g.value
+  FROM (
+    SELECT key, value
+    FROM src
+    GROUP BY value, key) f
+  JOIN (
+    SELECT key, value
+    FROM src
+    GROUP BY key, value) g
+  ON (f.key = g.key AND f.value = g.value)
+  GROUP BY f.key, g.value
+) x
+ORDER BY x.value desc, x.key desc;

http://git-wip-us.apache.org/repos/asf/hive/blob/4accba2f/ql/src/test/results/clientpositive/annotate_stats_join.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/annotate_stats_join.q.out b/ql/src/test/results/clientpositive/annotate_stats_join.q.out
index d83d7db..223a7ce 100644
--- a/ql/src/test/results/clientpositive/annotate_stats_join.q.out
+++ b/ql/src/test/results/clientpositive/annotate_stats_join.q.out
@@ -244,9 +244,9 @@ STAGE PLANS:
           outputColumnNames: _col0, _col1, _col2
           Statistics: Num rows: 48 Data size: 4752 Basic stats: COMPLETE Column stats: COMPLETE
           Reduce Output Operator
-            key expressions: _col1 (type: int), _col0 (type: string)
+            key expressions: _col0 (type: string), _col1 (type: int)
             sort order: ++
-            Map-reduce partition columns: _col1 (type: int), _col0 (type: string)
+            Map-reduce partition columns: _col0 (type: string), _col1 (type: int)
             Statistics: Num rows: 48 Data size: 4752 Basic stats: COMPLETE Column stats: COMPLETE
             value expressions: _col2 (type: int)
       TableScan
@@ -260,17 +260,17 @@ STAGE PLANS:
           outputColumnNames: _col0, _col1
           Statistics: Num rows: 6 Data size: 570 Basic stats: COMPLETE Column stats: COMPLETE
           Reduce Output Operator
-            key expressions: _col0 (type: int), _col1 (type: string)
+            key expressions: _col1 (type: string), _col0 (type: int)
             sort order: ++
-            Map-reduce partition columns: _col0 (type: int), _col1 (type: string)
+            Map-reduce partition columns: _col1 (type: string), _col0 (type: int)
             Statistics: Num rows: 6 Data size: 570 Basic stats: COMPLETE Column stats: COMPLETE
       Reduce Operator Tree:
         Join Operator
          condition map:
               Inner Join 0 to 1
          keys:
-           0 _col1 (type: int), _col0 (type: string)
-           1 _col0 (type: int), _col1 (type: string)
+           0 _col0 (type: string), _col1 (type: int)
+           1 _col1 (type: string), _col0 (type: int)
          outputColumnNames: _col0, _col1, _col2, _col3, _col4
          Statistics: Num rows: 6 Data size: 1164 Basic stats: COMPLETE Column stats: COMPLETE
          File Output Operator
@@ -310,9 +310,9 @@ STAGE PLANS:
           outputColumnNames: _col0, _col1, _col2
           Statistics: Num rows: 48 Data size: 4752 Basic stats: COMPLETE Column stats: COMPLETE
           Reduce Output Operator
-            key expressions: _col1 (type: int), _col0 (type: string)
+            key expressions: _col0 (type: string), _col1 (type: int)
             sort order: ++
-            Map-reduce partition columns: _col1 (type: int), _col0 (type: string)
+            Map-reduce partition columns: _col0 (type: string), _col1 (type: int)
             Statistics: Num rows: 48 Data size: 4752 Basic stats: COMPLETE Column stats: COMPLETE
             value expressions: _col2 (type: int)
       TableScan
@@ -326,17 +326,17 @@ STAGE PLANS:
           outputColumnNames: _col0, _col1
           Statistics: Num rows: 6 Data size: 570 Basic stats: COMPLETE Column stats: COMPLETE
           Reduce Output Operator
-            key expressions: _col0 (type: int), _col1 (type: string)
+            key expressions: _col1 (type: string), _col0 (type: int)
             sort order: ++
-            Map-reduce partition columns: _col0 (type: int), _col1 (type: string)
+            Map-reduce partition columns: _col1 (type: string), _col0 (type: int)
             Statistics: Num rows: 6 Data size: 570 Basic stats: COMPLETE Column stats: COMPLETE
       Reduce Operator Tree:
         Join Operator
          condition map:
               Inner Join 0 to 1
          keys:
-           0 _col1 (type: int), _col0 (type: string)
-           1 _col0 (type: int), _col1 (type: string)
+           0 _col0 (type: string), _col1 (type: int)
+           1 _col1 (type: string), _col0 (type: int)
          outputColumnNames: _col0, _col1, _col2, _col3, _col4
          Statistics: Num rows: 6 Data size: 1164 Basic stats: COMPLETE Column stats: COMPLETE
          File Output Operator
@@ -380,9 +380,9 @@ STAGE PLANS:
           outputColumnNames: _col0, _col1, _col2
           Statistics: Num rows: 48 Data size: 4752 Basic stats: COMPLETE Column stats: COMPLETE
           Reduce Output Operator
-            key expressions: _col1 (type: int), _col0 (type: string), _col0 (type: string)
-            sort order: +++
-            Map-reduce partition columns: _col1 (type: int), _col0 (type: string), _col0 (type: string)
+            key expressions: _col0 (type: string), _col1 (type: int)
+            sort order: ++
+            Map-reduce partition columns: _col0 (type: string), _col1 (type: int)
             Statistics: Num rows: 48 Data size: 4752 Basic stats: COMPLETE Column stats: COMPLETE
             value expressions: _col2 (type: int)
       TableScan
@@ -396,22 +396,22 @@ STAGE PLANS:
           outputColumnNames: _col0, _col1
           Statistics: Num rows: 6 Data size: 570 Basic stats: COMPLETE Column stats: COMPLETE
           Reduce Output Operator
-            key expressions: _col0 (type: int), _col1 (type: string), _col1 (type: string)
-            sort order: +++
-            Map-reduce partition columns: _col0 (type: int), _col1 (type: string), _col1 (type: string)
+            key expressions: _col1 (type: string), _col0 (type: int)
+            sort order: ++
+            Map-reduce partition columns: _col1 (type: string), _col0 (type: int)
             Statistics: Num rows: 6 Data size: 570 Basic stats: COMPLETE Column stats: COMPLETE
       Reduce Operator Tree:
         Join Operator
          condition map:
               Inner Join 0 to 1
          keys:
-           0 _col1 (type: int), _col0 (type: string), _col0 (type: string)
-           1 _col0 (type: int), _col1 (type: string), _col1 (type: string)
+           0 _col0 (type: string), _col1 (type: int)
+           1 _col1 (type: string), _col0 (type: int)
          outputColumnNames: _col0, _col1, _col2, _col3, _col4
-         Statistics: Num rows: 11 Data size: 2134 Basic stats: COMPLETE Column stats: COMPLETE
+         Statistics: Num rows: 6 Data size: 1164 Basic stats: COMPLETE Column stats: COMPLETE
          File Output Operator
            compressed: false
-           Statistics: Num rows: 11 Data size: 2134 Basic stats: COMPLETE Column stats: COMPLETE
+           Statistics: Num rows: 6 Data size: 1164 Basic stats: COMPLETE Column stats: COMPLETE
            table:
                input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
@@ -626,9 +626,9 @@ STAGE PLANS:
           outputColumnNames: _col0, _col1, _col2
           Statistics: Num rows: 48 Data size: 4752 Basic stats: COMPLETE Column stats: COMPLETE
           Reduce Output Operator
-            key expressions: _col1 (type: int), _col0 (type: string)
+            key expressions: _col0 (type: string), _col1 (type: int)
             sort order: ++
-            Map-reduce partition columns: _col1 (type: int), _col0 (type: string)
+            Map-reduce partition columns: _col0 (type: string), _col1 (type: int)
             Statistics: Num rows: 48 Data size: 4752 Basic stats: COMPLETE Column stats: COMPLETE
             value expressions: _col2 (type: int)
       TableScan
@@ -642,9 +642,9 @@ STAGE PLANS:
           outputColumnNames: _col0, _col1
           Statistics: Num rows: 6 Data size: 570 Basic stats: COMPLETE Column stats: COMPLETE
           Reduce Output Operator
-            key expressions: _col0 (type: int), _col1 (type: string)
+            key expressions: _col1 (type: string), _col0 (type: int)
             sort order: ++
-            Map-reduce partition columns: _col0 (type: int), _col1 (type: string)
+            Map-reduce partition columns: _col1 (type: string), _col0 (type: int)
             Statistics: Num rows: 6 Data size: 570 Basic stats: COMPLETE Column stats: COMPLETE
       TableScan
         alias: l
@@ -657,9 +657,9 @@ STAGE PLANS:
           outputColumnNames: _col0, _col1, _col2, _col3
           Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
           Reduce Output Operator
-            key expressions: _col1 (type: int), _col0 (type: string)
+            key expressions: _col0 (type: string), _col1 (type: int)
             sort order: ++
-            Map-reduce partition columns: _col1 (type: int), _col0 (type: string)
+            Map-reduce partition columns: _col0 (type: string), _col1 (type: int)
             Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
             value expressions: _col2 (type: bigint), _col3 (type: int)
       Reduce Operator Tree:
@@ -668,9 +668,9 @@ STAGE PLANS:
          condition map:
               Inner Join 0 to 1
               Inner Join 0 to 2
          keys:
-           0 _col1 (type: int), _col0 (type: string)
-           1 _col0 (type: int), _col1 (type: string)
-           2 _col1 (type: int), _col0 (type: string)
+           0 _col0 (type: string), _col1 (type: int)
+           1 _col1 (type: string), _col0 (type: int)
+           2 _col0 (type: string), _col1 (type: int)
          outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8
          Statistics: Num rows: 1 Data size: 296 Basic stats: COMPLETE Column stats: COMPLETE
          File Output Operator

http://git-wip-us.apache.org/repos/asf/hive/blob/4accba2f/ql/src/test/results/clientpositive/bucket_groupby.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/bucket_groupby.q.out b/ql/src/test/results/clientpositive/bucket_groupby.q.out
index 867fad4..f808bba 100644
--- a/ql/src/test/results/clientpositive/bucket_groupby.q.out
+++ b/ql/src/test/results/clientpositive/bucket_groupby.q.out
@@ -1547,12 +1547,12 @@ STAGE PLANS:
             alias: clustergroupby
             Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
             Select Operator
-              expressions: value (type: string), key (type: string)
-              outputColumnNames: _col0, _col1
+              expressions: key (type: string), value (type: string)
+              outputColumnNames: _col1, _col0
               Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
               Group By Operator
                 aggregations: count(1)
-                keys: _col0 (type: string), _col1 (type: string)
+                keys: _col1 (type: string), _col0 (type: string)
                 mode: hash
                 outputColumnNames: _col0, _col1, _col2
                 Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
@@ -1561,6 +1561,7 @@ STAGE PLANS:
                   sort order: ++
                   Map-reduce partition columns: _col0 (type: string), _col1 (type: string)
                   Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
+                  TopN Hash Memory Usage: 0.1
                   value expressions: _col2 (type: bigint)
       Reduce Operator Tree:
         Group By Operator
@@ -1570,7 +1571,7 @@ STAGE PLANS:
           outputColumnNames: _col0, _col1, _col2
           Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE
           Select Operator
-            expressions: _col1 (type: string), _col2 (type: bigint)
+            expressions: _col0 (type: string), _col2 (type: bigint)
             outputColumnNames: _col0, _col1
             Statistics: Num rows: 250 Data size: 2656 Basic stats: COMPLETE Column stats: NONE
             File Output Operator

http://git-wip-us.apache.org/repos/asf/hive/blob/4accba2f/ql/src/test/results/clientpositive/correlationoptimizer13.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/correlationoptimizer13.q.out b/ql/src/test/results/clientpositive/correlationoptimizer13.q.out
index ac5bdc6..240c1ad 100644
--- a/ql/src/test/results/clientpositive/correlationoptimizer13.q.out
+++ b/ql/src/test/results/clientpositive/correlationoptimizer13.q.out
@@ -23,7 +23,6 @@ POSTHOOK: Lineage: tmp.c4 SIMPLE [(src)x.FieldSchema(name:value, type:string, co
 PREHOOK: query: -- The query in this file have operators with same set of keys
 -- but having different sorting orders.
 -- Correlation optimizer currently do not optimize this case.
--- This case will be optimized latter (need a follow-up jira).
 EXPLAIN
 SELECT xx.key1, xx.key2, yy.key1, yy.key2, xx.cnt, yy.cnt
@@ -36,7 +35,6 @@ PREHOOK: type: QUERY
 POSTHOOK: query: -- The query in this file have operators with same set of keys
 -- but having different sorting orders.
 -- Correlation optimizer currently do not optimize this case.
--- This case will be optimized latter (need a follow-up jira).
 EXPLAIN
 SELECT xx.key1, xx.key2, yy.key1, yy.key2, xx.cnt, yy.cnt
@@ -48,10 +46,8 @@ ON (xx.key1 = yy.key1 AND xx.key2 == yy.key2) ORDER BY xx.key1, xx.key2, yy.key1
 POSTHOOK: type: QUERY
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
-  Stage-2 depends on stages: Stage-1, Stage-4
-  Stage-3 depends on stages: Stage-2
-  Stage-4 is a root stage
-  Stage-0 depends on stages: Stage-3
+  Stage-2 depends on stages: Stage-1
+  Stage-0 depends on stages: Stage-2

 STAGE PLANS:
   Stage: Stage-1
@@ -64,140 +60,120 @@ STAGE PLANS:
             predicate: ((c1 < 120) and c3 is not null) (type: boolean)
             Statistics: Num rows: 342 Data size: 7639 Basic stats: COMPLETE Column stats: NONE
             Select Operator
-              expressions: c3 (type: string), c1 (type: int)
-              outputColumnNames: _col0, _col1
+              expressions: c1 (type: int), c3 (type: string)
+              outputColumnNames: _col1, _col0
               Statistics: Num rows: 342 Data size: 7639 Basic stats: COMPLETE Column stats: NONE
               Group By Operator
                 aggregations: count(1)
-                keys: _col0 (type: string), _col1 (type: int)
+                keys: _col1 (type: int), _col0 (type: string)
                 mode: hash
                 outputColumnNames: _col0, _col1, _col2
                 Statistics: Num rows: 342 Data size: 7639 Basic stats: COMPLETE Column stats: NONE
                 Reduce Output Operator
-                  key expressions: _col0 (type: string), _col1 (type: int)
+                  key expressions: _col0 (type: int), _col1 (type: string)
                   sort order: ++
-                  Map-reduce partition columns: _col0 (type: string), _col1 (type: int)
+                  Map-reduce partition columns: _col0 (type: int), _col1 (type: string)
                   Statistics: Num rows: 342 Data size: 7639 Basic stats: COMPLETE Column stats: NONE
                   value expressions: _col2 (type: bigint)
+          TableScan
+            alias: x
+            Statistics: Num rows: 1028 Data size: 22964 Basic stats: COMPLETE Column stats: NONE
+            Filter Operator
+              predicate: ((c2 > 100) and (c1 < 120) and c3 is not null) (type: boolean)
+              Statistics: Num rows: 114 Data size: 2546 Basic stats: COMPLETE Column stats: NONE
+              Select Operator
+                expressions: c1 (type: int), c3 (type: string)
+                outputColumnNames: _col1, _col0
+                Statistics: Num rows: 114 Data size: 2546 Basic stats: COMPLETE Column stats: NONE
+                Group By Operator
+                  aggregations: count(1)
+                  keys: _col1 (type: int), _col0 (type: string)
+                  mode: hash
+                  outputColumnNames: _col0, _col1, _col2
+                  Statistics: Num rows: 114 Data size: 2546 Basic stats: COMPLETE Column stats: NONE
+                  Reduce Output Operator
+                    key expressions: _col0 (type: int), _col1 (type: string)
+                    sort order: ++
+                    Map-reduce partition columns: _col0 (type: int), _col1 (type: string)
+                    Statistics: Num rows: 114 Data size: 2546 Basic stats: COMPLETE Column stats: NONE
+                    value expressions: _col2 (type: bigint)
       Reduce Operator Tree:
-        Group By Operator
-          aggregations: count(VALUE._col0)
-          keys: KEY._col0 (type: string), KEY._col1 (type: int)
-          mode: mergepartial
-          outputColumnNames: _col0, _col1, _col2
-          Statistics: Num rows: 171 Data size: 3819 Basic stats: COMPLETE Column stats: NONE
-          Select Operator
-            expressions: _col1 (type: int), _col0 (type: string), _col2 (type: bigint)
+      Demux Operator
+        Statistics: Num rows: 456 Data size: 10185 Basic stats: COMPLETE Column stats: NONE
+        Group By Operator
+          aggregations: count(VALUE._col0)
+          keys: KEY._col0 (type: int), KEY._col1 (type: string)
+          mode: mergepartial
+          outputColumnNames: _col0, _col1, _col2
+          Statistics: Num rows: 228 Data size: 5092 Basic stats: COMPLETE Column stats: NONE
+          Mux Operator
+            Statistics: Num rows: 456 Data size: 10184 Basic stats: COMPLETE Column stats: NONE
+            Join Operator
+              condition map:
+                   Inner Join 0 to 1
+              keys:
+                0 _col0 (type: int), _col1 (type: string)
+                1 _col0 (type: int), _col1 (type: string)
+              outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
+              Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
+              Select Operator
+                expressions: _col0 (type: int), _col1 (type: string), _col3 (type: int), _col4 (type: string), _col2 (type: bigint), _col5 (type: bigint)
+                outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
+                Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
+                File Output Operator
+                  compressed: false
+                  table:
+                      input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+                      output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+                      serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
+        Group By Operator
+          aggregations: count(VALUE._col0)
+          keys: KEY._col0 (type: int), KEY._col1 (type: string)
+          mode: mergepartial
           outputColumnNames: _col0, _col1, _col2
-          Statistics: Num rows: 171 Data size: 3819 Basic stats: COMPLETE Column stats: NONE
-          File Output Operator
-            compressed: false
-            table:
-                input format: org.apache.hadoop.mapred.SequenceFileInputFormat
-                output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
-                serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
+          Statistics: Num rows: 228 Data size: 5092 Basic stats: COMPLETE Column stats: NONE
+          Mux Operator
+            Statistics: Num rows: 456 Data size: 10184 Basic stats: COMPLETE Column stats: NONE
+            Join Operator
+              condition map:
+                   Inner Join 0 to 1
+              keys:
+                0 _col0 (type: int), _col1 (type: string)
+                1 _col0 (type: int), _col1 (type: string)
+              outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
+              Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
+              Select Operator
+                expressions: _col0 (type: int), _col1 (type: string), _col3 (type: int), _col4 (type: string), _col2 (type: bigint), _col5 (type: bigint)
+                outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
+                Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
+                File Output Operator
+                  compressed: false
+                  table:
+                      input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+                      output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+                      serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe

   Stage: Stage-2
     Map Reduce
       Map Operator Tree:
           TableScan
             Reduce Output Operator
-              key expressions: _col0 (type: int), _col1 (type: string)
-              sort order: ++
-              Map-reduce partition columns: _col0 (type: int), _col1 (type: string)
-              Statistics: Num rows: 171 Data size: 3819 Basic stats: COMPLETE Column stats: NONE
-              value expressions: _col2 (type: bigint)
-          TableScan
-            Reduce Output Operator
-              key expressions: _col0 (type: int), _col1 (type: string)
-              sort order: ++
-              Map-reduce partition columns: _col0 (type: int), _col1 (type: string)
-              Statistics: Num rows: 57 Data size: 1273 Basic stats: COMPLETE Column stats: NONE
-              value expressions: _col2 (type: bigint)
-      Reduce Operator Tree:
-        Join Operator
-          condition map:
-               Inner Join 0 to 1
-          keys:
-            0 _col0 (type: int), _col1 (type: string)
-            1 _col0 (type: int), _col1 (type: string)
-          outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
-          Statistics: Num rows: 188 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
-          Select Operator
-            expressions: _col0 (type: int), _col1 (type: string), _col3 (type: int), _col4 (type: string), _col2 (type: bigint), _col5 (type: bigint)
-            outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
-            Statistics: Num rows: 188 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
-            File Output Operator
-              compressed: false
-              table:
-                  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
-                  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
-                  serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
-
-  Stage: Stage-3
-    Map Reduce
-      Map Operator Tree:
-          TableScan
-            Reduce Output Operator
               key expressions: _col0 (type: int), _col1 (type: string), _col2 (type: int), _col3 (type: string), _col4 (type: bigint), _col5 (type: bigint)
               sort order: ++++++
-              Statistics: Num rows: 188 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
+              Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
      Reduce Operator Tree:
        Select Operator
          expressions: KEY.reducesinkkey0 (type: int), KEY.reducesinkkey1 (type: string), KEY.reducesinkkey2 (type: int), KEY.reducesinkkey3 (type: string), KEY.reducesinkkey4 (type: bigint), KEY.reducesinkkey5 (type: bigint)
          outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
-         Statistics: Num rows: 188 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
+         Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
          File Output Operator
            compressed: false
-           Statistics: Num rows: 188 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
+           Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
            table:
                input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

-  Stage: Stage-4
-    Map Reduce
-      Map Operator Tree:
-          TableScan
-            alias: x
-            Statistics: Num rows: 1028 Data size: 22964 Basic stats: COMPLETE Column stats: NONE
-            Filter Operator
-              predicate: ((c2 > 100) and (c1 < 120) and c3 is not null) (type: boolean)
-              Statistics: Num rows: 114 Data size: 2546 Basic stats: COMPLETE Column stats: NONE
-              Select Operator
-                expressions: c3 (type: string), c1 (type: int)
-                outputColumnNames: _col0, _col1
-                Statistics: Num rows: 114 Data size: 2546 Basic stats: COMPLETE Column stats: NONE
-                Group By Operator
-                  aggregations: count(1)
-                  keys: _col0 (type: string), _col1 (type: int)
-                  mode: hash
-                  outputColumnNames: _col0, _col1, _col2
-                  Statistics: Num rows: 114 Data size: 2546 Basic stats: COMPLETE Column stats: NONE
-                  Reduce Output Operator
-                    key expressions: _col0 (type: string), _col1 (type: int)
-                    sort order: ++
-                    Map-reduce partition columns: _col0 (type: string), _col1 (type: int)
-                    Statistics: Num rows: 114 Data size: 2546 Basic stats: COMPLETE Column stats: NONE
-                    value expressions: _col2 (type: bigint)
-      Reduce Operator Tree:
-        Group By Operator
-          aggregations: count(VALUE._col0)
-          keys: KEY._col0 (type: string), KEY._col1 (type: int)
-          mode: mergepartial
-          outputColumnNames: _col0, _col1, _col2
-          Statistics: Num rows: 57 Data size: 1273 Basic stats: COMPLETE Column stats: NONE
-          Select Operator
-            expressions: _col1 (type: int), _col0 (type: string), _col2 (type: bigint)
-            outputColumnNames: _col0, _col1, _col2
-            Statistics: Num rows: 57 Data size: 1273 Basic stats: COMPLETE Column stats: NONE
-            File Output Operator
-              compressed: false
-              table:
-                  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
-                  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
-                  serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
-
   Stage: Stage-0
     Fetch Operator
       limit: -1

http://git-wip-us.apache.org/repos/asf/hive/blob/4accba2f/ql/src/test/results/clientpositive/dynamic_rdd_cache.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/dynamic_rdd_cache.q.out b/ql/src/test/results/clientpositive/dynamic_rdd_cache.q.out
index 4088a39..e2dbea3 100644
--- a/ql/src/test/results/clientpositive/dynamic_rdd_cache.q.out
+++ b/ql/src/test/results/clientpositive/dynamic_rdd_cache.q.out
@@ -1046,12 +1046,12 @@ STAGE PLANS:
                 outputColumnNames: _col2, _col4, _col5, _col6
                 Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                 Select Operator
-                  expressions: _col4 (type: int), _col5 (type: int), _col6 (type: string), _col2 (type: int)
-                  outputColumnNames: _col4, _col5, _col6, _col2
+                  expressions: _col5 (type: int), _col4 (type: int), _col6 (type: string), _col2 (type: int)
+                  outputColumnNames: _col5, _col4, _col6, _col2
                   Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                   Group By Operator
                     aggregations: stddev_samp(_col2), avg(_col2)
-                    keys: _col4 (type: int), _col5 (type: int), _col6 (type: string)
+                    keys: _col5 (type: int), _col4 (type: int), _col6 (type: string)
                     mode: hash
                     outputColumnNames: _col0, _col1, _col2, _col3, _col4
                     Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
@@ -1080,7 +1080,7 @@ STAGE PLANS:
           outputColumnNames: _col0, _col1, _col2, _col3, _col4
           Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
           Select Operator
-            expressions: _col1 (type: int), _col0 (type: int), _col3 (type: double), _col4 (type: double)
+            expressions: _col0 (type: int), _col1 (type: int), _col3 (type: double), _col4 (type: double)
             outputColumnNames: _col1, _col2, _col3, _col4
             Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
             Filter Operator
@@ -1102,16 +1102,16 @@ STAGE PLANS:
       Map Operator Tree:
           TableScan
             Reduce Output Operator
-              key expressions: _col2 (type: int), _col1 (type: int)
+              key expressions: _col1 (type: int), _col2 (type: int)
               sort order: ++
-              Map-reduce partition columns: _col2 (type: int), _col1 (type: int)
+              Map-reduce partition columns: _col1 (type: int), _col2 (type: int)
               Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
               value expressions: _col3 (type: double), _col4 (type: double)
           TableScan
             Reduce Output Operator
-              key expressions: _col2 (type: int), _col1 (type: int)
+              key expressions: _col1 (type: int), _col2 (type: int)
               sort order: ++
-              Map-reduce partition columns: _col2 (type: int), _col1 (type: int)
+              Map-reduce partition columns: _col1 (type: int), _col2 (type: int)
               Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
               value expressions: _col3 (type: double), _col4 (type: double)
       Reduce Operator Tree:
@@ -1119,8 +1119,8 @@ STAGE PLANS:
           condition map:
               Inner Join 0 to 1
           keys:
-            0 _col2 (type: int), _col1 (type: int)
-            1 _col2 (type: int), _col1 (type: int)
+            0 _col1 (type: int), _col2 (type: int)
+            1 _col1 (type: int), _col2 (type: int)
           outputColumnNames: _col1, _col2, _col3, _col4, _col6, _col7, _col8, _col9
           Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
           Select Operator
@@ -1283,12 +1283,12 @@ STAGE PLANS:
                 outputColumnNames: _col2, _col4, _col5, _col6
                 Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                 Select Operator
-                  expressions: _col4 (type: int), _col5 (type: int), _col6 (type: string), _col2 (type: int)
-                  outputColumnNames: _col4, _col5, _col6, _col2
+                  expressions: _col5 (type: int), _col4 (type: int), _col6 (type: string), _col2 (type: int)
+                  outputColumnNames: _col5, _col4, _col6, _col2
                   Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                   Group By Operator
                     aggregations: stddev_samp(_col2), avg(_col2)
-                    keys: _col4 (type: int), _col5 (type: int), _col6 (type: string)
+                    keys: _col5 (type: int), _col4 (type: int), _col6 (type: string)
                     mode: hash
                     outputColumnNames: _col0, _col1, _col2, _col3, _col4
                     Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
@@ -1317,7 +1317,7 @@ STAGE PLANS:
           outputColumnNames: _col0, _col1, _col2, _col3, _col4
           Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
           Select Operator
-            expressions: _col1 (type: int), _col0 (type: int), _col3 (type: double), _col4 (type: double)
+            expressions: _col0 (type: int), _col1 (type: int), _col3 (type: double), _col4 (type: double)
             outputColumnNames: _col1, _col2, _col3, _col4
             Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
             Filter Operator

http://git-wip-us.apache.org/repos/asf/hive/blob/4accba2f/ql/src/test/results/clientpositive/filter_cond_pushdown.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/filter_cond_pushdown.q.out b/ql/src/test/results/clientpositive/filter_cond_pushdown.q.out
index 46b701f..dc54bce 100644
--- a/ql/src/test/results/clientpositive/filter_cond_pushdown.q.out
+++ b/ql/src/test/results/clientpositive/filter_cond_pushdown.q.out
@@ -290,9 +290,9 @@ STAGE PLANS:
               outputColumnNames: _col0, _col1, _col2
               Statistics: Num rows: 20 Data size: 262 Basic stats: COMPLETE Column stats: NONE
               Reduce Output Operator
-                key expressions: UDFToDouble(_col0) (type: double), _col0 (type: string)
+                key expressions: _col0 (type: string), UDFToDouble(_col0) (type: double)
                 sort order: ++
-                Map-reduce partition columns: UDFToDouble(_col0) (type: double), _col0 (type: string)
+                Map-reduce partition columns: _col0 (type: string), UDFToDouble(_col0) (type: double)
                 Statistics: Num rows: 20 Data size: 262 Basic stats: COMPLETE Column stats: NONE
                 value expressions: _col1 (type: int), _col2 (type: float)
           TableScan
@@ -306,9 +306,9 @@ STAGE PLANS:
               outputColumnNames: _col0, _col2
               Statistics: Num rows: 10 Data size: 131 Basic stats: COMPLETE Column stats: NONE
               Reduce Output Operator
-                key expressions: 1.0 (type: double), _col0 (type: string)
+                key expressions: _col0 (type: string), 1.0 (type: double)
                 sort order: ++
-                Map-reduce partition columns: 1.0 (type: double), _col0 (type: string)
+                Map-reduce partition columns: _col0 (type: string), 1.0 (type: double)
                 Statistics: Num rows: 10 Data size: 131 Basic stats: COMPLETE Column stats: NONE
                 value expressions: _col2 (type: float)
       Reduce Operator Tree:
@@ -316,8 +316,8 @@ STAGE PLANS:
          condition map:
               Inner Join 0 to 1
          keys:
-            0 UDFToDouble(_col0) (type: double), _col0 (type: string)
-            1 1.0 (type: double), _col0 (type: string)
+            0 _col0 (type: string), UDFToDouble(_col0) (type: double)
+            1 _col0 (type: string), 1.0 (type: double)
          outputColumnNames: _col0, _col1, _col2, _col5
          Statistics: Num rows: 22 Data size: 288 Basic stats: COMPLETE Column stats: NONE
          Filter Operator
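
Note on the plan churn above: each hunk shows a ReduceSinkOperator whose key order differed from a downstream sink, which previously blocked deduplication. The query shape can be reconstructed from the correlationoptimizer13 plan: the table `x` and columns `c1`, `c2`, `c3` appear in its TableScan and Filter, while the aliases and the final ORDER BY below are assumptions for illustration, not the literal .q text. A minimal sketch:

    EXPLAIN
    SELECT xx.c1, xx.c3, yy.c1, yy.c3, xx.cnt, yy.cnt
    FROM (SELECT c3, c1, count(1) AS cnt
            FROM x
           WHERE c2 > 100 AND c1 < 120
           GROUP BY c3, c1) xx                    -- aggregation keyed (c3, c1)
    JOIN (SELECT c3, c1, count(1) AS cnt
            FROM x
           WHERE c2 > 100 AND c1 < 120
           GROUP BY c3, c1) yy
      ON xx.c1 = yy.c1 AND xx.c3 = yy.c3          -- join keyed (c1, c3)
    ORDER BY xx.c1, xx.c3, yy.c1, yy.c3, xx.cnt, yy.cnt;

With this patch, the aggregations can be re-keyed as (c1, c3) to align with the join, so the shuffles collapse into the single Demux/Mux stage seen in the new plan instead of the separate Stage-2/Stage-3/Stage-4 jobs removed above.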