hive-dev mailing list archives

From "Xuefu Zhang" <xzh...@cloudera.com>
Subject Re: Review Request 34666: HIVE-9152 - Dynamic Partition Pruning [Spark Branch]
Date Mon, 06 Jul 2015 23:08:28 GMT


> On May 27, 2015, 6:52 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java, line 177
> > <https://reviews.apache.org/r/34666/diff/1/?file=971700#file971700line177>
> >
> >     Any chance that an op might be visited multiple times?
> 
> Chao Sun wrote:
>     It shouldn't - it's a tree traversal and every operator should only be added once.

Actually, there could be a diamond shape in the operator graph, such as the one formed by
demux and mux operators. The join operator is another example. We should use a graph
traversal instead of a tree traversal, so that an operator reachable by more than one path
is still added only once.
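
To illustrate the difference, here is a minimal sketch. It uses a hypothetical Node class as a stand-in for an operator (it is not Hive's Operator API, and the method names are made up for this example). On a diamond-shaped graph, the tree-style walk collects the shared node twice, while the graph-style walk with a visited set collects it once.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class GraphTraversalSketch {
  // Hypothetical stand-in for an operator; NOT Hive's Operator class.
  static final class Node {
    final String name;
    final List<Node> children = new ArrayList<>();

    Node(String name) {
      this.name = name;
    }
  }

  // Tree-style walk: no visited set, so a node reachable by two
  // paths is added once per path.
  static void collectAsTree(Node root, List<Node> out) {
    out.add(root);
    for (Node child : root.children) {
      collectAsTree(child, out);
    }
  }

  // Graph-style walk: the visited set ensures each node is added once,
  // no matter how many parents lead to it.
  static List<Node> collectAsGraph(Node root) {
    List<Node> out = new ArrayList<>();
    Set<Node> visited = new HashSet<>();
    Deque<Node> stack = new ArrayDeque<>();
    stack.push(root);
    while (!stack.isEmpty()) {
      Node current = stack.pop();
      if (!visited.add(current)) {
        continue; // already collected via another path
      }
      out.add(current);
      for (Node child : current.children) {
        stack.push(child);
      }
    }
    return out;
  }

  // Builds a diamond: root -> left -> join, root -> right -> join.
  static Node diamond() {
    Node root = new Node("root");
    Node left = new Node("left");
    Node right = new Node("right");
    Node join = new Node("join");
    root.children.add(left);
    root.children.add(right);
    left.children.add(join);
    right.children.add(join);
    return root;
  }

  public static void main(String[] args) {
    Node root = diamond();
    List<Node> asTree = new ArrayList<>();
    collectAsTree(root, asTree);
    System.out.println(asTree.size());               // 5: "join" is added twice
    System.out.println(collectAsGraph(root).size()); // 4: each node added once
  }
}
```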


> On May 27, 2015, 6:52 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java, line 519
> > <https://reviews.apache.org/r/34666/diff/1/?file=971702#file971702line519>
> >
> >     numThread could be <= 0?
> 
> Chao Sun wrote:
>     It could be equal to 0, since getInputPaths() could return 0. This would result in an
>     IAE (IllegalArgumentException) from newFixedThreadPool.

There may be a problem as you described, but I think it's irrelevant to the work here.
We should create a separate JIRA to fix it instead of including the fix in this patch.
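
For whoever picks up that JIRA, a small sketch of the failure mode and one possible guard. The class and method names are hypothetical and the clamping policy is an assumption, not what CombineHiveInputFormat does today. The only library fact relied on is that Executors.newFixedThreadPool throws IllegalArgumentException when nThreads <= 0.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadPoolGuardSketch {
  // Returns true if creating a fixed pool of the given size is rejected.
  // Executors.newFixedThreadPool throws IllegalArgumentException
  // when nThreads <= 0.
  static boolean failsToCreate(int numThreads) {
    try {
      ExecutorService pool = Executors.newFixedThreadPool(numThreads);
      pool.shutdown();
      return false;
    } catch (IllegalArgumentException e) {
      return true;
    }
  }

  // Hypothetical guard (an assumption, not the existing Hive logic):
  // cap the pool size at maxThreads and never go below 1, so an empty
  // list of input paths cannot produce a zero-sized pool.
  static int safePoolSize(int numPaths, int maxThreads) {
    return Math.max(1, Math.min(numPaths, maxThreads));
  }

  public static void main(String[] args) {
    System.out.println(failsToCreate(0));                  // true
    System.out.println(failsToCreate(safePoolSize(0, 8))); // false
  }
}
```

An alternative would be to return early when there are no input paths and skip pool creation altogether; which of the two fits better is for the separate JIRA to decide.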


- Xuefu


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34666/#review85230
-----------------------------------------------------------


On July 3, 2015, 10:45 p.m., Chao Sun wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34666/
> -----------------------------------------------------------
> 
> (Updated July 3, 2015, 10:45 p.m.)
> 
> 
> Review request for hive, chengxiang li and Xuefu Zhang.
> 
> 
> Bugs: HIVE-9152
>     https://issues.apache.org/jira/browse/HIVE-9152
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Tez implemented dynamic partition pruning in HIVE-7826. This is a nice optimization and
> we should implement the same in HOS.
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43c53fc 
>   itests/src/test/resources/testconfiguration.properties 2a5f7e3 
>   ql/if/queryplan.thrift c8dfa35 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 91e8a02 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 21398d8
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkDynamicPartitionPruner.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java e6c845c 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSparkPartitionPruningSinkOperator.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 1de7e40 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 9d5730d 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java 8546d21
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java ea5efe5 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/RemoveDynamicPruningBySize.java 4803959
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 5f731d7 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkPartitionPruningSinkDesc.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 447f104
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java e27ce0d 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/OptimizeSparkProcContext.java f7586a4
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 19aae70 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkPartitionPruningSinkOperator.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SplitOpTreeForDPP.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java 05a5841 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java aa291b9 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 363e49e 
>   ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning_2.q PRE-CREATION
>   ql/src/test/results/clientpositive/spark/bucket2.q.out 89c3b4c 
>   ql/src/test/results/clientpositive/spark/bucket3.q.out 2fc4855 
>   ql/src/test/results/clientpositive/spark/bucket4.q.out 44e0f9f 
>   ql/src/test/results/clientpositive/spark/column_access_stats.q.out 3e16f61 
>   ql/src/test/results/clientpositive/spark/limit_partition_metadataonly.q.out e95d2ab
>   ql/src/test/results/clientpositive/spark/list_bucket_dml_2.q.java1.7.out e38ccf8 
>   ql/src/test/results/clientpositive/spark/optimize_nullscan.q.out 881f41a 
>   ql/src/test/results/clientpositive/spark/pcr.q.out 4c22f0b 
>   ql/src/test/results/clientpositive/spark/sample3.q.out 2fe6b0d 
>   ql/src/test/results/clientpositive/spark/sample9.q.out c9823f7 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_11.q.out c3f996f 
>   ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning.q.out PRE-CREATION
>   ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning_2.q.out PRE-CREATION
>   ql/src/test/results/clientpositive/spark/temp_table.q.out 16d663d 
>   ql/src/test/results/clientpositive/spark/udf_example_add.q.out 7916679 
>   ql/src/test/results/clientpositive/spark/udf_in_file.q.out c769d1f 
>   ql/src/test/results/clientpositive/spark/union_view.q.out 593ce40 
>   ql/src/test/results/clientpositive/spark/vector_elt.q.out 180ea15 
>   ql/src/test/results/clientpositive/spark/vector_string_concat.q.out 9ec8538 
>   ql/src/test/results/clientpositive/spark/vectorization_decimal_date.q.out bafd62f 
>   ql/src/test/results/clientpositive/spark/vectorization_div0.q.out 30d116f 
>   ql/src/test/results/clientpositive/spark/vectorized_case.q.out daf6ad3 
>   ql/src/test/results/clientpositive/spark/vectorized_dynamic_partition_pruning.q.out PRE-CREATION
>   ql/src/test/results/clientpositive/spark/vectorized_math_funcs.q.out 470d9a9 
>   ql/src/test/results/clientpositive/spark/vectorized_string_funcs.q.out ef98ae9 
> 
> Diff: https://reviews.apache.org/r/34666/diff/
> 
> 
> Testing
> -------
> 
> spark_dynamic_partition_pruning.q, spark_dynamic_partition_pruning_2.q - both are cloned
> from Tez's tests.
> 
> 
> Thanks,
> 
> Chao Sun
> 
>

