hive-dev mailing list archives

From "Szehon Ho" <sze...@cloudera.com>
Subject Re: Review Request 25906: HIVE-7856 : Enable parallelism in Reduce Side Join [Spark Branch]
Date Thu, 25 Sep 2014 00:39:36 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25906/
-----------------------------------------------------------

(Updated Sept. 25, 2014, 12:39 a.m.)


Review request for hive.


Changes
-------

Fix test failures.

1. Try using 'SORT_BEFORE_DIFF' for the newly added parallel tests, to get deterministic
results.
2. Fix some bugs in SparkEdgeProperty.
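
To illustrate why item 1 helps: with more than one reducer, the order in which rows from
different partitions reach the output can vary between runs, so a line-by-line diff against
the expected output may fail even when the row set is correct. Below is a minimal plain-Java
sketch of the idea of sorting both sides before comparing. It is not the Hive test harness
code; the class and method names are hypothetical and the rows are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not taken from the Hive test framework.
class SortBeforeDiffSketch {

    // Order-sensitive comparison: fails if the same rows arrive in a different order.
    static boolean sameAsIs(List<String> expected, List<String> actual) {
        return expected.equals(actual);
    }

    // Order-insensitive comparison: sort copies of both sides, then compare.
    static boolean sameAfterSort(List<String> expected, List<String> actual) {
        List<String> e = new ArrayList<>(expected);
        List<String> a = new ArrayList<>(actual);
        Collections.sort(e);
        Collections.sort(a);
        return e.equals(a);
    }

    public static void main(String[] args) {
        // Made-up rows: same content, different arrival order.
        List<String> expected = List.of("0\tval_0", "2\tval_2", "4\tval_4");
        List<String> actual = List.of("2\tval_2", "0\tval_0", "4\tval_4");
        System.out.println(sameAsIs(expected, actual));      // false
        System.out.println(sameAfterSort(expected, actual)); // true
    }
}
```

The trade-off is that a sorted comparison can no longer catch a query whose output order is
itself wrong, so it suits tests where only the set of rows matters.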


Bugs: HIVE-7856
    https://issues.apache.org/jira/browse/HIVE-7856


Repository: hive-git


Description
-------

This work consumes the new API provided by SPARK-2978, 'repartitionAndSortWithinPartitions'.

We now need to distinguish between the old sort-by, which is a total-order sort, and this
one, which does a partition-level sort.  So I added a new SparkEdge type for it.  We call
this API only if it's a partition-level sort.  This will be the case, of course, for
reduce-side join.
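
The difference between the two kinds of sort can be sketched in plain Java, without Spark.
This is only an illustration under stated assumptions: the class and method names are
hypothetical, keys are plain integers, and partitioning is a simple hash modulo the number
of partitions. It is not the code in this patch and does not call the Spark API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not part of the patch and not Spark code.
class PartitionSortSketch {

    // Total-order sort: a single globally sorted sequence of keys.
    static List<Integer> totalOrderSort(List<Integer> keys) {
        List<Integer> sorted = new ArrayList<>(keys);
        Collections.sort(sorted);
        return sorted;
    }

    // Partition-level sort: assign each key to a partition by hash, then sort
    // each partition independently. Equal keys always land in the same
    // partition, but there is no ordering guarantee across partitions.
    static List<List<Integer>> partitionLevelSort(List<Integer> keys, int numPartitions) {
        List<List<Integer>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (Integer key : keys) {
            int index = Math.floorMod(key.hashCode(), numPartitions);
            partitions.get(index).add(key);
        }
        for (List<Integer> partition : partitions) {
            Collections.sort(partition);
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<Integer> keys = List.of(5, 2, 4, 1, 3, 2);
        System.out.println(totalOrderSort(keys));        // [1, 2, 2, 3, 4, 5]
        System.out.println(partitionLevelSort(keys, 2)); // [[2, 2, 4], [1, 3, 5]]
    }
}
```

For a reduce-side join, the partition-level form is enough: all rows with the same join key
are in one partition and adjacent after the sort, so each reducer can process its key groups
independently, and several reducers can run in parallel.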


Diffs (updated)
-----

  itests/src/test/resources/testconfiguration.properties 637fbc1 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SortByShuffler.java 446e3cc 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 7ab2ca0 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java ed06a57 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkWork.java 4f889db 
  ql/src/java/org/apache/hadoop/hive/ql/plan/SparkEdgeProperty.java bdfef87 
  ql/src/test/queries/clientpositive/parallel_join0.q PRE-CREATION 
  ql/src/test/queries/clientpositive/parallel_join1.q PRE-CREATION 
  ql/src/test/results/clientpositive/spark/char_join1.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/spark/join0.q.out 913f57a 
  ql/src/test/results/clientpositive/spark/join1.q.out 9db644b 
  ql/src/test/results/clientpositive/spark/join10.q.out 5122c56 
  ql/src/test/results/clientpositive/spark/join11.q.out f4a080f 
  ql/src/test/results/clientpositive/spark/join12.q.out 1b5992f 
  ql/src/test/results/clientpositive/spark/join13.q.out c64bdb3 
  ql/src/test/results/clientpositive/spark/join14.q.out 9dcc6c8 
  ql/src/test/results/clientpositive/spark/join15.q.out ca7b5c5 
  ql/src/test/results/clientpositive/spark/join16.q.out 3a57bf5 
  ql/src/test/results/clientpositive/spark/join17.q.out 7c6d9ff 
  ql/src/test/results/clientpositive/spark/join18.q.out 3278dde 
  ql/src/test/results/clientpositive/spark/join19.q.out 87606fd 
  ql/src/test/results/clientpositive/spark/join2.q.out 0c3880b 
  ql/src/test/results/clientpositive/spark/join20.q.out 56b4bed 
  ql/src/test/results/clientpositive/spark/join21.q.out 0e08bf8 
  ql/src/test/results/clientpositive/spark/join22.q.out 1c8ab7c 
  ql/src/test/results/clientpositive/spark/join23.q.out ecd8371 
  ql/src/test/results/clientpositive/spark/join25.q.out 71df358 
  ql/src/test/results/clientpositive/spark/join26.q.out 06246a4 
  ql/src/test/results/clientpositive/spark/join27.q.out 8cbe599 
  ql/src/test/results/clientpositive/spark/join3.q.out 2f47a21 
  ql/src/test/results/clientpositive/spark/join4.q.out 48ea655 
  ql/src/test/results/clientpositive/spark/join5.q.out d1130fe 
  ql/src/test/results/clientpositive/spark/join6.q.out bfbe240 
  ql/src/test/results/clientpositive/spark/join7.q.out 1f5a4cc 
  ql/src/test/results/clientpositive/spark/join8.q.out 70782cc 
  ql/src/test/results/clientpositive/spark/join9.q.out f0c4172 
  ql/src/test/results/clientpositive/spark/join_nullsafe.q.out 48d5d76 
  ql/src/test/results/clientpositive/spark/parallel_join0.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/spark/parallel_join1.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/25906/diff/


Testing
-------

Added a few tests that force reducers > 1 and manually verified the results.


Thanks,

Szehon Ho

