Subject: Re: Review Request 28064: HIVE-8844 Choose a persisent policy for RDD caching [Spark Branch]
From: "Chao Sun"
To: "Xuefu Zhang"
Cc: "Szehon Ho", "Jimmy Xiang", "hive", "Chao Sun"
Date: Sat, 15 Nov 2014 03:04:30 -0000


> On Nov. 15, 2014, 2:34 a.m., Szehon Ho wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ShuffleTran.java, line 39
> >
> >     OK, does Spark handle it as a no-op if we pass NONE in? If that's the case, then that may be cleaner for our code. I'm a bit confused about what NONE means.
> >
> >     If we don't want to pass NONE due to side effects, can we just change the HadoopRDD call to:
> >
> >     storageHandler.equals(StorageHandler.NONE) ? hadoopRdd : ...
> >
> >     Then the logic is centralized there.
>
> Jimmy Xiang wrote:
>     Sure. Will fix it as suggested. Thanks.

persist() also registers the RDD for GC cleanup, but there seems to be no extra cost besides that. Either way is fine with me.


- Chao


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28064/#review61616
-----------------------------------------------------------


On Nov. 15, 2014, 12:32 a.m., Jimmy Xiang wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28064/
> -----------------------------------------------------------
>
> (Updated Nov. 15, 2014, 12:32 a.m.)
>
>
> Review request for hive and Xuefu Zhang.
>
>
> Bugs: HIVE-8844
>     https://issues.apache.org/jira/browse/HIVE-8844
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Changed the Spark cache policy to be configurable, defaulting to memory+disk.
>
>
> Diffs
> -----
>
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/MapInput.java 79baea7
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ShuffleTran.java 8565ba0
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 11f4236
>
> Diff: https://reviews.apache.org/r/28064/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Jimmy Xiang
>
>