hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jimmy Xiang" <jxi...@cloudera.com>
Subject Re: Review Request 28064: HIVE-8844 Choose a persisent policy for RDD caching [Spark Branch]
Date Sat, 15 Nov 2014 02:26:51 GMT


> On Nov. 15, 2014, 2:08 a.m., Szehon Ho wrote:
> > Looks good, some nits:

Thanks a lot for the review.


> On Nov. 15, 2014, 2:08 a.m., Szehon Ho wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java, line 156
> > <https://reviews.apache.org/r/28064/diff/1/?file=764643#file764643line156>
> >
> >     Exception handling?

In case configuration error, a runtime exception will be thrown. I thought user can run the
query again. What do you think, should we catch it and use the default one?


> On Nov. 15, 2014, 2:08 a.m., Szehon Ho wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ShuffleTran.java, line 39
> > <https://reviews.apache.org/r/28064/diff/1/?file=764642#file764642line39>
> >
> >     Do you know why we have to do this, and cannot just use NONE?

If we use NONE, we have to do 'hadoopRDD.mapToPair(new CopyFunction()).persist(storageLevel)'.
This looks like some extra work with no help. If it is null, we just use hadoopRDD, which
I think it is a little better.


> On Nov. 15, 2014, 2:08 a.m., Szehon Ho wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java, line 155
> > <https://reviews.apache.org/r/28064/diff/1/?file=764643#file764643line155>
> >
> >     Instead of hardcoding, can we just use the StorageHandler.MEMORY_AND_DISK if
the string is null?
> 
> Szehon Ho wrote:
>     It would also be great if we start a file for defaults like this, (a bunch in SparkClient),
but understand its outside scope of this change.

Sure, will use use the StorageHandler.MEMORY_AND_DISK if the string is null.

Thought about puting these constants in a specific file. However, they are not re-used in
many places. We can handle it later I think.


- Jimmy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28064/#review61605
-----------------------------------------------------------


On Nov. 15, 2014, 12:32 a.m., Jimmy Xiang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28064/
> -----------------------------------------------------------
> 
> (Updated Nov. 15, 2014, 12:32 a.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Bugs: HIVE-8844
>     https://issues.apache.org/jira/browse/HIVE-8844
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Changed spark cache policy to be configurable with default memory+disk.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/MapInput.java 79baea7 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ShuffleTran.java 8565ba0 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 11f4236 
> 
> Diff: https://reviews.apache.org/r/28064/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jimmy Xiang
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message