Subject: Re: how garbage collection works on parallelize
From: Josh Rosen
To: jluan
Cc: user
Date: Fri, 8 Jan 2016 19:34:39 -0800

It won't be GC'd as long as the RDD returned by `parallelize()` is kept
around; that RDD keeps strong references to the parallelized collection's
elements in order to enable fault tolerance.

On Fri, Jan 8, 2016 at 6:50 PM, jluan wrote:

> Hi,
>
> I am curious about garbage collection of an object which gets parallelized.
> Say we have a really large array (say 40GB in RAM) that we want to
> parallelize across our machines.
>
> I have the following function:
>
> def doSomething(): RDD[Double] = {
>   val reallyBigArray = Array[Double](some really big value)
>   sc.parallelize(reallyBigArray)
> }
>
> Theoretically, will reallyBigArray be marked for GC? Or will reallyBigArray
> not be GC'd because parallelize somehow holds a reference to reallyBigArray?
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/how-garbage-collection-works-on-parallelize-tp25926.html
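
To make the lifetime described in the reply concrete, here is a minimal sketch. It assumes spark-core is on the classpath and uses a local master; the object name `ParallelizeLifetime`, the `run` helper, the app name, and the small array size are illustrative stand-ins that do not appear in the thread. The `SparkContext` is passed in explicitly instead of relying on a shell-provided `sc`.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object ParallelizeLifetime {
  // Same shape as the function in the question. The size is a small
  // stand-in for the 40GB array (assumption, for illustration only).
  def doSomething(sc: SparkContext): RDD[Double] = {
    val reallyBigArray = Array.fill(1000)(1.0)
    // The local variable goes out of scope when this method returns, but
    // the returned RDD still references the data it was built from, so
    // that data stays reachable on the driver.
    sc.parallelize(reallyBigArray)
  }

  def run(): Double = {
    val conf = new SparkConf()
      .setAppName("parallelize-lifetime") // hypothetical app name
      .setMaster("local[2]")              // local mode, for the sketch only
    val sc = new SparkContext(conf)
    try {
      val rdd = doSomething(sc)
      // A derived RDD references its parent through the lineage, so
      // holding `doubled` also keeps `rdd` and its data reachable.
      // Only when nothing references `rdd` or anything derived from it
      // does the driver-side data become eligible for collection.
      val doubled = rdd.map(_ * 2.0)
      doubled.sum() / 2.0
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = println(run())
}
```

In practice this means the driver needs room for the array for as long as the RDD, or anything derived from it, is referenced. If that is a problem, dropping those references is what allows the JVM to reclaim it.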