hive-dev mailing list archives

From Aihua Xu <...@cloudera.com>
Subject Re: Review Request 42508: HIVE-12889: Support COUNT(DISTINCT) for partitioning query.
Date Mon, 25 Jan 2016 16:36:41 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42508/
-----------------------------------------------------------

(Updated Jan. 25, 2016, 4:36 p.m.)


Review request for hive, Chaoyu Tang, Szehon Ho, and Xuefu Zhang.


Changes
-------

Attached patch-3: simplify the copy and compare objects using ObjectInspectorUtils.


Repository: hive-git


Description
-------

HIVE-12889: Support COUNT(DISTINCT) for partitioning query.


Diffs (updated)
-----

  data/files/windowing_distinct.txt PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/functions/HiveSqlCountAggFunction.java 7937040 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/functions/HiveSqlSumAggFunction.java 8f62970 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/PlanModifierForASTConv.java e2fbb4f 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/SqlFunctionConverter.java 37249f9 
  ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 3fefbd7 
  ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 15ca754 
  ql/src/java/org/apache/hadoop/hive/ql/parse/PTFInvocationSpec.java 29b8510 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 5ff90a6 
  ql/src/java/org/apache/hadoop/hive/ql/parse/WindowingSpec.java a181f7c 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCount.java eaf112e 
  ql/src/test/queries/clientpositive/windowing_distinct.q PRE-CREATION 
  ql/src/test/results/clientpositive/windowing_distinct.q.out PRE-CREATION 
  serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java 7a13eb0


Diff: https://reviews.apache.org/r/42508/diff/


Testing
-------

Support count(distinct) over a partitioning window.

1. Enable the parser to properly parse queries such as "count(distinct) over (partition by c1)".
2. ORDER BY and windowing frames are not supported with distinct functions, due to performance
concerns and implementation requirements.
3. The distinct fields are inserted into the order-by list, so during counting we only need
to compare the current row against the previously remembered row.
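
To illustrate item 3, here is a minimal sketch of counting distinct values within one partition whose rows are already sorted on the distinct field. It is not the code in the patch: the class name SortedDistinctCount and the method countDistinctSorted are hypothetical, and plain strings stand in for the Hive objects that the patch copies and compares through ObjectInspectorUtils.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of the HIVE-12889 patch.
public class SortedDistinctCount {

    /**
     * Counts distinct non-null values in a list that is already sorted,
     * so that equal values sit next to each other. Only the previously
     * remembered value is kept, not a set of all values seen so far.
     */
    public static long countDistinctSorted(List<String> sortedValues) {
        long count = 0;
        String previous = null;
        boolean seenAny = false;
        for (String current : sortedValues) {
            if (current == null) {
                continue; // COUNT skips NULL values
            }
            if (!seenAny || !current.equals(previous)) {
                count++;            // a new distinct value starts here
                previous = current; // remember it for the next comparison
                seenAny = true;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // One partition, rows sorted on the distinct field.
        List<String> partition = Arrays.asList("a", "a", "b", "c", "c");
        System.out.println(countDistinctSorted(partition)); // prints 3
    }
}
```

Because the rows arrive sorted, each value needs a single comparison against the remembered row, which is presumably why the distinct fields are added to the order-by list rather than tracked in a per-partition set.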


Thanks,

Aihua Xu

