spark-issues mailing list archives

From "Alexander Pivovarov (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-12655) GraphX does not unpersist RDDs
Date Wed, 06 Jan 2016 21:45:39 GMT

     [ https://issues.apache.org/jira/browse/SPARK-12655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Pivovarov updated SPARK-12655:
----------------------------------------
    Affects Version/s: 1.6.0
          Description: 
It looks like calling unpersist() on a Graph does not remove all of its RDDs from the cache.
{code}
// open spark-shell 1.5.2 or 1.6.0
// run

import org.apache.spark.graphx._

val vert = sc.parallelize(List((1L, 1), (2L, 2), (3L, 3)), 1)
val edges = sc.parallelize(List(Edge[Long](1L, 2L), Edge[Long](1L, 3L)), 1)

val g0 = Graph(vert, edges)
val g = g0.partitionBy(PartitionStrategy.EdgePartition2D, 2)
val cc = g.connectedComponents()

cc.unpersist()
g.unpersist()
g0.unpersist()
vert.unpersist()
edges.unpersist()
{code}
Then open http://localhost:4040/storage/
The Storage page of the Spark UI (port 4040) still shows 2 cached RDDs:
{code}
VertexRDD      Memory Deserialized 1x Replicated   1  100%    1688.0 B   0.0 B  0.0 B
EdgeRDD        Memory Deserialized 1x Replicated   2  100%      4.7 KB   0.0 B  0.0 B
{code}
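
The leftover entries can also be checked from the shell rather than the UI. The following is only a sketch, assuming it is run in the same spark-shell session right after the unpersist() calls above; it uses SparkContext.getPersistentRDDs to list whatever is still registered as cached, and then unpersists those RDDs by hand as a possible workaround. It is not a fix for the issue itself, and it drops every cached RDD in the session, not only the GraphX ones.

```scala
// Assumes the spark-shell session from the repro above; `sc` is the shell's SparkContext.

// List every RDD the context still tracks as persistent.
sc.getPersistentRDDs.foreach { case (id, rdd) =>
  println(s"id=$id name=${rdd.name} level=${rdd.getStorageLevel.description}")
}

// Possible workaround: explicitly unpersist whatever is left.
// Caution: this affects ALL cached RDDs in the session.
sc.getPersistentRDDs.values.foreach(_.unpersist(blocking = true))

// Number of RDDs still registered as persistent after the cleanup.
println(sc.getPersistentRDDs.size)
```

If the first loop prints the VertexRDD and EdgeRDD entries seen on the Storage page, that would suggest the RDDs are still registered with the SparkContext and are not just stale UI state.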

  was:
Looks like Graph does not clean all RDDs from the cache on unpersist
{code}
// open spark-shell 1.5.2
// run

import org.apache.spark.graphx._

val vert = sc.parallelize(List((1L, 1), (2L, 2), (3L, 3)), 1)
val edges = sc.parallelize(List(Edge[Long](1L, 2L), Edge[Long](1L, 3L)), 1)

val g0 = Graph(vert, edges)
val g = g0.partitionBy(PartitionStrategy.EdgePartition2D, 2)
val cc = g.connectedComponents()

cc.unpersist()
g.unpersist()
g0.unpersist()
vert.unpersist()
edges.unpersist()
{code}
open http://localhost:4040/storage/
Spark UI 4040 Storage page still shows 2 items
{code}
VertexRDD      Memory Deserialized 1x Replicated   1  100%    1688.0 B   0.0 B  0.0 B
EdgeRDD        Memory Deserialized 1x Replicated   2  100%      4.7 KB   0.0 B  0.0 B
{code}


> GraphX does not unpersist RDDs
> ------------------------------
>
>                 Key: SPARK-12655
>                 URL: https://issues.apache.org/jira/browse/SPARK-12655
>             Project: Spark
>          Issue Type: Bug
>          Components: GraphX
>    Affects Versions: 1.5.2, 1.6.0
>            Reporter: Alexander Pivovarov



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

