hive-dev mailing list archives

From "John Pullokkaran" <jpullokka...@hortonworks.com>
Subject Re: Review Request 39199: HIVE-12084 : Hive queries with ORDER BY and large LIMIT fails with OutOfMemoryError Java heap space
Date Wed, 14 Oct 2015 22:52:10 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39199/#review102714
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/exec/PTFTopNHash.java (line 180)
<https://reviews.apache.org/r/39199/#comment160465>

    This does not seem right:
    "partitionHeaps" seems to allocate memory for a TopNHash per key, so we would need to know how many distinct keys are present; i.e., we need to know the NDV.
    For each key we are going to store its own TopN.
    
    If column stats are available, we should use them. We also need to handle composite keys.
    
    If stats are not available, a configurable heuristic may be an option; e.g., assume NDV is 10% of cardinality.
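
    A rough sketch of the estimate described above, to make the arithmetic concrete. This is not code from the patch or from Hive: the class name, method names, and parameters are all hypothetical. It also assumes that a composite key's NDV can be bounded by the product of the per-column NDVs, capped at the row count, which is an upper bound rather than an exact figure.

```java
// Hypothetical sketch; none of these names exist in Hive.
final class TopNMemoryEstimate {

    /** Assumed fallback: treat NDV as this fraction of the row count when stats are missing. */
    static final double DEFAULT_NDV_FRACTION = 0.10;

    /**
     * Estimates the number of distinct partition keys.
     * columnNdvs holds one NDV per key column; null or empty means no column stats.
     */
    static long estimateNdv(long rowCount, long[] columnNdvs, double fallbackFraction) {
        if (columnNdvs == null || columnNdvs.length == 0) {
            // No stats: fall back to the configurable heuristic.
            return Math.max(1L, (long) (rowCount * fallbackFraction));
        }
        long product = 1L;
        for (long ndv : columnNdvs) {
            long n = Math.max(1L, ndv);
            // Composite key: the product of column NDVs can never exceed the
            // row count, so stop at the cap (this also avoids overflow).
            if (product > rowCount / n) {
                return Math.max(1L, rowCount);
            }
            product *= n;
        }
        return Math.max(1L, Math.min(product, rowCount));
    }

    /** One TopN heap per distinct key: ndv * topN * average row size in bytes. */
    static long estimateBytes(long ndv, int topN, long avgRowBytes) {
        return Math.multiplyExact(Math.multiplyExact(ndv, (long) topN), avgRowBytes);
    }
}
```

    The result of estimateBytes could then be compared against the memory budget for the TopN optimization before deciding whether to keep it enabled; how that budget is obtained is left open here.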


- John Pullokkaran


On Oct. 14, 2015, 6:10 p.m., Hari Sankar Sivarama Subramaniyan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39199/
> -----------------------------------------------------------
> 
> (Updated Oct. 14, 2015, 6:10 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and John Pullokkaran.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Please look at https://issues.apache.org/jira/browse/HIVE-12084
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/PTFTopNHash.java f93b420 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java e33c1d4 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/TopNHash.java 484006a 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/PhysicalOptimizer.java 49706b1 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/VerifyTopNMemoryUsage.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java a60527b 
>   ql/src/test/queries/clientpositive/topn.q PRE-CREATION 
>   ql/src/test/results/clientpositive/topn.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/39199/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Hari Sankar Sivarama Subramaniyan
> 
>

