hive-dev mailing list archives

From "Gunther Hagleitner" <ghagleit...@hortonworks.com>
Subject Re: Review Request 24427: HIVE-7616 pre-size mapjoin hashtable based on statistics
Date Thu, 07 Aug 2014 00:04:18 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24427/#review49830
-----------------------------------------------------------



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
<https://reviews.apache.org/r/24427/#comment87207>

    I think once you cross 1000 characters some underscores help readability. Or drop stats
and estimate from the name.



ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java
<https://reviews.apache.org/r/24427/#comment87222>

    This would be good to know at the info level, I think. Also, you've copied the lines above
for the wrapper but not the logging.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java
<https://reviews.apache.org/r/24427/#comment87209>

    You initialize with null - why use Long.MAX_VALUE here?
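
    To illustrate the concern, here is a minimal sketch (not the code under review) that
uses null as the single "no estimate" marker, so a Long.MAX_VALUE sentinel is never needed.
The class and method names are hypothetical.

```java
public final class EstimateMerge {
    private EstimateMerge() {
    }

    /**
     * Hypothetical helper: keeps the smaller of two estimates, treating
     * null as "no estimate yet" rather than mixing in Long.MAX_VALUE.
     */
    public static Long minKnown(Long current, Long candidate) {
        if (candidate == null) {
            return current;
        }
        if (current == null) {
            return candidate;
        }
        return Math.min(current, candidate);
    }
}
```

    With this shape, a caller only has to check for null once, at the point where the
estimate is consumed.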



ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java
<https://reviews.apache.org/r/24427/#comment87211>

    ditto



ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java
<https://reviews.apache.org/r/24427/#comment87212>

    Use curly braces, per the coding standard.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java
<https://reviews.apache.org/r/24427/#comment87221>

    I think this number needs to be adjusted for bucketed map join. Otherwise you'll
over-allocate in that case, even though we already take the bucketing into consideration when
we do size estimation for the overall operator...
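
    A rough sketch of the kind of adjustment meant here, assuming keys are spread evenly
across buckets and each task loads roughly one bucket's worth of keys. The class, the method
name, and the use of -1 for "unknown" are all hypothetical, not part of the patch.

```java
public final class KeyCountEstimate {
    private KeyCountEstimate() {
    }

    /**
     * Hypothetical helper: scales a table-wide key count estimate down to a
     * per-bucket estimate. A negative input means "unknown" and is passed
     * through as -1 (an assumption made for this sketch).
     */
    public static long perBucketKeyCount(long totalKeyCount, int numBuckets) {
        if (totalKeyCount < 0) {
            return -1L;
        }
        if (numBuckets <= 1) {
            // Not bucketed: the whole estimate applies to a single hashtable.
            return totalKeyCount;
        }
        // Ceiling division, written to avoid overflow for very large estimates.
        long quotient = totalKeyCount / numBuckets;
        long remainder = totalKeyCount % numBuckets;
        return remainder == 0 ? quotient : quotient + 1;
    }
}
```

    Rounding up errs on the side of a slightly larger table rather than an early resize.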



ql/src/java/org/apache/hadoop/hive/ql/plan/MapJoinDesc.java
<https://reviews.apache.org/r/24427/#comment87208>

    TODO: either put a JIRA number on it, fix it, or drop it.



ql/src/java/org/apache/hadoop/hive/ql/plan/MapJoinDesc.java
<https://reviews.apache.org/r/24427/#comment87219>

    Confusing name. We already have "stats" in each desc, which has multiple values. How about
parentToNumberKeyEstimate?


- Gunther Hagleitner


On Aug. 6, 2014, 10 p.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24427/
> -----------------------------------------------------------
> 
> (Updated Aug. 6, 2014, 10 p.m.)
> 
> 
> Review request for hive, Gunther Hagleitner, Mostafa Mokhtar, and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> See jira
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 8490558
>   ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java cf64aa0
>   ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/BytesBytesMultiHashMap.java cdb5dc5
>   ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java 5b3b770
>   ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java 629457c
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 6d292d0
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java 29d895a
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MapJoinDesc.java 44cb9c0
> 
> Diff: https://reviews.apache.org/r/24427/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>

