hive-dev mailing list archives

From "Jimmy Xiang" <jxi...@cloudera.com>
Subject Re: Review Request 30739: HIVE-9574 Lazy computing in HiveBaseFunctionResultList may hurt performance [Spark Branch]
Date Tue, 10 Feb 2015 17:24:29 GMT


> On Feb. 10, 2015, 3:24 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java, line 56
> > <https://reviews.apache.org/r/30739/diff/5/?file=858858#file858858line56>
> >
> >     This one is also better to be private, if not used outside this class.

It is used in the unit test.


> On Feb. 10, 2015, 3:24 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java, line 107
> > <https://reviews.apache.org/r/30739/diff/5/?file=858858#file858858line107>
> >
> >     Is it possible to have fd leak, if "new Output()" fails?

Right, fixed.
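
For readers following the thread: the leak in question arises when a file stream is opened successfully and the wrapper built on top of it (here "new Output()") then throws, leaving the file descriptor open with nothing holding a reference to close it. Below is a minimal sketch of one way to guard against that. It is not the actual patch from the review; the class and method names (WrapOrClose, wrapOrClose) are hypothetical, and it assumes the wrapper constructor fails with an unchecked exception.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.function.Function;

// Hypothetical helper, not part of Hive: closes the raw resource if wrapping it fails.
final class WrapOrClose {
  private WrapOrClose() {}

  /**
   * Applies the wrapper to an already opened resource. If the wrapper throws,
   * the raw resource is closed before the original exception is rethrown,
   * so the underlying file descriptor is not leaked.
   */
  static <C extends Closeable, T> T wrapOrClose(
      C raw, Function<? super C, ? extends T> wrapper) {
    try {
      return wrapper.apply(raw);
    } catch (RuntimeException | Error e) {
      try {
        raw.close();
      } catch (IOException closeFailure) {
        // Keep the original failure as the primary exception.
        e.addSuppressed(closeFailure);
      }
      throw e;
    }
  }

  // Assumed usage with Kryo's Output (tmpFile is a hypothetical variable):
  //   Output out = wrapOrClose(new FileOutputStream(tmpFile), fos -> new Output(fos));
}
```

On success the caller owns the wrapper and closes it as usual, which in turn closes the underlying stream; the helper only steps in on the failure path.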


> On Feb. 10, 2015, 3:24 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java, line 135
> > <https://reviews.apache.org/r/30739/diff/5/?file=858858#file858858line135>
> >
> >     Nit: could we move the constructor to the top, after the member variables?

Done.


- Jimmy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30739/#review71790
-----------------------------------------------------------


On Feb. 9, 2015, 7:41 p.m., Jimmy Xiang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30739/
> -----------------------------------------------------------
> 
> (Updated Feb. 9, 2015, 7:41 p.m.)
> 
> 
> Review request for hive, Rui Li and Xuefu Zhang.
> 
> 
> Bugs: HIVE-9574
>     https://issues.apache.org/jira/browse/HIVE-9574
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> The result KV cache no longer uses RowContainer, since RowContainer carries logic we
> don't need and that adds overhead. We also don't start lazy computing right away;
> instead we wait until the cache is close to spilling.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java 78ab680
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java 8ead0cb
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveMapFunction.java 7a09b4d
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveMapFunctionResultList.java e92e299
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveReduceFunction.java 070ea4d
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveReduceFunctionResultList.java d4ff37c
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/KryoSerializer.java 286816b
>   ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestHiveKVResultCache.java 0df4598
> 
> Diff: https://reviews.apache.org/r/30739/diff/
> 
> 
> Testing
> -------
> 
> Unit test, test on cluster
> 
> 
> Thanks,
> 
> Jimmy Xiang
> 
>

