hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jason Dere" <jd...@hortonworks.com>
Subject Re: Review Request 34059: HIVE-10673 Dynamically partitioned hash join for Tez
Date Fri, 15 May 2015 22:08:24 GMT


> On May 12, 2015, 6:26 a.m., Alexander Pivovarov wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java, line
107
> > <https://reviews.apache.org/r/34059/diff/1/?file=955672#file955672line107>
> >
> >     ReduceSinkOperator uses Object.hashCode() and equals() methods.
> >     HashSet algo relies on hashCode/equals methods
> 
> Jason Dere wrote:
>     So that means equals() only works if it is the exact same ReduceSinkOperator object.
This should be ok for our usage, if we are referring to the same ReduceSinkOperator, we should
be using that exact same object.
> 
> Alexander Pivovarov wrote:
>     Do you want to use IdentityHashMap then?
>     This class implements the Map interface with a hash table, using reference-equality
in place of object-equality when comparing keys (and values). In other words, in an IdentityHashMap,
two keys k1 and k2 are considered equal if and only if (k1==k2)
> 
> Jason Dere wrote:
>     We're using a Set here as opposed to a Map. I'll change to use Sets.newIdentityHashSet()
from Guava.
> 
> Alexander Pivovarov wrote:
>     IdentityHashMap contains private KeySet class already
>     to get its instance you can call keySet() method
>     e.g.
>         IdentityHashMap<Integer, Object> rsMap = new IdentityHashMap<Integer,
Object>();
>         rsMap.put(1, null);
>         rsMap.put(2, null);
>         rsMap.put(3, null);
>         Set<Integer> rsSet = rsMap.keySet();
>         System.out.println(rsSet);
>         [3, 1, 2]

I do not see a need to change this further - it has already been changed to use an Identity-based
set


- Jason


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34059/#review83362
-----------------------------------------------------------


On May 15, 2015, 10:02 p.m., Jason Dere wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34059/
> -----------------------------------------------------------
> 
> (Updated May 15, 2015, 10:02 p.m.)
> 
> 
> Review request for hive, Matt McCline and Vikram Dixit Kumaraswamy.
> 
> 
> Bugs: HIVE-10673
>     https://issues.apache.org/jira/browse/HIVE-10673
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Reduce-side hash join (using MapJoinOperator), where the Tez inputs to the reducer are
unsorted.
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java eff4d30 
>   itests/src/test/resources/testconfiguration.properties c79c36c 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java b1352f3 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java d7f1b42 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/KeyValuesAdapter.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/KeyValuesFromKeyValue.java PRE-CREATION

>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/KeyValuesFromKeyValues.java PRE-CREATION

>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordProcessor.java 545d7c6 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordSource.java cdabe3a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java e9bd44a

>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinCommonOperator.java
af78776 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java bcffdbc 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 4d84f0f 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java f7e1dbc

>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezProcContext.java adc31ae 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java 0edfc5d 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezWork.java 6db8220 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/BaseWork.java a342738 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java fb3c4a3 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MapJoinDesc.java cee9100 
>   ql/src/test/queries/clientpositive/tez_dynpart_hashjoin_1.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/tez_dynpart_hashjoin_2.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/tez_vector_dynpart_hashjoin_1.q PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/tez_dynpart_hashjoin_1.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/tez_dynpart_hashjoin_2.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/tez/tez_vector_dynpart_hashjoin_1.q.out PRE-CREATION

> 
> Diff: https://reviews.apache.org/r/34059/diff/
> 
> 
> Testing
> -------
> 
> q-file tests added
> 
> 
> Thanks,
> 
> Jason Dere
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message