hive-dev mailing list archives

From Chao Sun <chao....@cloudera.com>
Subject Re: Review Request 55776: Eliminate unbounded memory usage for orderBy and groupBy in Hive on Spark
Date Fri, 20 Jan 2017 18:26:39 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55776/#review162449
-----------------------------------------------------------




ql/src/java/org/apache/hadoop/hive/ql/exec/spark/GroupByShuffler.java (line 31)
<https://reviews.apache.org/r/55776/#comment233782>

    Is it possible for `numPartitions` to be 0?



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/GroupByShuffler.java (line 34)
<https://reviews.apache.org/r/55776/#comment233785>

    I wonder whether this also has some extra cost compared to the original `groupByKey`,
since it needs to sort all records by key within a partition, right?
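
    To make the two questions above concrete, here is a minimal plain-Java sketch. It is not
the code under review and does not use the Spark API: `ShuffleSketch`, `partitionFor`, and
`partitionAndSort` are hypothetical names, and the assumption that the patch replaces
grouping with a partition-then-sort shuffle comes only from the comment above. It contrasts
hash grouping (one pass, but every value for a key is held in memory at once) with
bucketing records and sorting each bucket by key (an extra O(n log n) sort per partition,
after which equal keys are adjacent and can be consumed as a stream). It also shows why a
partition count of 0 needs an explicit guard.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ShuffleSketch {

    // Assumption for illustration: reject non-positive partition counts up front,
    // because "hash mod 0" would otherwise fail with an ArithmeticException.
    static void checkPartitions(int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be > 0, got " + numPartitions);
        }
    }

    // Maps a key to a partition index in [0, numPartitions).
    static int partitionFor(String key, int numPartitions) {
        checkPartitions(numPartitions);
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Hash grouping: a single pass with no sort, but all values of a key are
    // materialized in one list, so memory use grows with the largest group.
    static Map<String, List<String>> groupByKey(List<Map.Entry<String, String>> records) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (Map.Entry<String, String> r : records) {
            groups.computeIfAbsent(r.getKey(), k -> new ArrayList<>()).add(r.getValue());
        }
        return groups;
    }

    // Partition-then-sort: bucket records by key hash, then sort each bucket by key.
    // The sort is the extra cost; afterwards equal keys sit next to each other.
    static List<List<Map.Entry<String, String>>> partitionAndSort(
            List<Map.Entry<String, String>> records, int numPartitions) {
        checkPartitions(numPartitions);
        List<List<Map.Entry<String, String>>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (Map.Entry<String, String> r : records) {
            partitions.get(partitionFor(r.getKey(), numPartitions)).add(r);
        }
        for (List<Map.Entry<String, String>> p : partitions) {
            p.sort(Map.Entry.comparingByKey()); // stable sort, O(n log n) per partition
        }
        return partitions;
    }
}
```

    Whether the sort cost is worth paying depends on how large the per-key groups get; the
sketch only illustrates the trade-off and makes no claim about the actual numbers in Hive on
Spark.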


I think we also need to update ql/src/test/results/clientpositive/union_top_level.q.out

- Chao Sun


On Jan. 20, 2017, 6:07 p.m., Xuefu Zhang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55776/
> -----------------------------------------------------------
> 
> (Updated Jan. 20, 2017, 6:07 p.m.)
> 
> 
> Review request for hive, Chao Sun and Rui Li.
> 
> 
> Bugs: HIVE-15580
>     https://issues.apache.org/jira/browse/HIVE-15580
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> See JIRA description.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/GroupByShuffler.java e128dd2 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveReduceFunction.java eeb4443 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveReduceFunctionResultList.java d57cac4 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ReduceTran.java 3d56876 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ShuffleTran.java a774395 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SortByShuffler.java 997ab7e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 66ffe5d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkReduceRecordHandler.java 0d31e5f 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkShuffler.java 40e251f 
>   ql/src/test/queries/clientpositive/union_top_level.q d93fe38 
>   ql/src/test/results/clientpositive/llap/union_top_level.q.out b48ab83 
>   ql/src/test/results/clientpositive/spark/lateral_view_explode2.q.out 65a6e3e 
>   ql/src/test/results/clientpositive/spark/union_remove_25.q.out 9fec1d4 
>   ql/src/test/results/clientpositive/spark/union_top_level.q.out c9cb5d3 
>   ql/src/test/results/clientpositive/spark/vector_outer_join5.q.out 9e1742f 
> 
> Diff: https://reviews.apache.org/r/55776/diff/
> 
> 
> Testing
> -------
> 
> All tests passed.
> 
> 
> Thanks,
> 
> Xuefu Zhang
> 
>

