spark-issues mailing list archives

From "Chris A. Mattmann (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-13634) Assigning spark context to variable results in serialization error
Date Tue, 08 Mar 2016 06:38:40 GMT

    [ https://issues.apache.org/jira/browse/SPARK-13634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184536#comment-15184536 ]

Chris A. Mattmann commented on SPARK-13634:
-------------------------------------------

I'm CC'ed because I'm the PI of the SciSpark project, and I asked Rahul to file this issue here.
It's not a toy example; it's a real example from our system. We have a workaround, but we were
wondering whether Apache Spark had thought of anything better or seen something similar.

Our code is here: 
https://github.com/Scispark/scispark/

The question I was asking was related to etiquette. I don't think it's good etiquette to close
tickets on which the reporter has weighed in. This one was closed in just 43 minutes, without
even waiting for Rahul to chime back in. Is it really that urgent to close a user-reported
issue that quickly, without hearing back from them to see whether your suggestion helped or
answered their question?

> Assigning spark context to variable results in serialization error
> ------------------------------------------------------------------
>
>                 Key: SPARK-13634
>                 URL: https://issues.apache.org/jira/browse/SPARK-13634
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Shell
>            Reporter: Rahul Palamuttam
>            Priority: Minor
>
> The following lines of code cause a task serialization error when executed in the spark-shell.

> Note that the error does not occur when the code is submitted as a batch job via spark-submit.
> val temp = 10
> val newSC = sc
> val newRDD = newSC.parallelize(0 to 100).map(p => p + temp)
> For some reason, when temp is pulled into the referencing environment of the closure, the
> SparkContext is pulled in as well.
> We originally hit this issue in the SciSpark project, when referencing a string variable
> inside a lambda expression in RDD.map(...).
> Any insight into how this could be resolved would be appreciated.
> While the above code is trivial, SciSpark uses a wrapper around the SparkContext to read
> from various file formats. We want to keep this class structure and also use it in notebook
> and shell environments.
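
As a sketch of the mechanism described in the quoted paragraph about temp (hypothetical names,
not SciSpark's actual code): in the shell, every top-level val becomes a field of a generated
line object, so a closure that references temp can drag that object, including a SparkContext
alias such as newSC, into serialization. One workaround is to build the closure inside a
function so it captures only a plain local:

    import org.apache.spark.rdd.RDD

    // Minimal sketch, names hypothetical: the lambda closes over only
    // `local`, a plain Int, so the shell's line object (and the
    // SparkContext alias it holds) stays out of the serialized closure.
    def addOffset(rdd: RDD[Int], offset: Int): RDD[Int] = {
      val local = offset
      rdd.map(p => p + local)
    }

    val newRDD = addOffset(newSC.parallelize(0 to 100), temp)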
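
For the wrapper-class constraint mentioned at the end of the description, one commonly cited
pattern is to mark the SparkContext field @transient so that serialization skips it if a
closure ever captures the wrapper by accident. This is a sketch with hypothetical names, not
SciSpark's actual API:

    import org.apache.spark.SparkContext
    import org.apache.spark.rdd.RDD

    // Hypothetical wrapper: @transient keeps the non-serializable
    // SparkContext out of the wrapper's serialized form, while
    // driver-side code can still call through it to build RDDs.
    class ContextWrapper(@transient val sc: SparkContext) extends Serializable {
      def range(n: Int): RDD[Int] = sc.parallelize(0 to n)
    }

On the driver this behaves like a normal handle; the context field is dropped only if the
wrapper instance itself gets serialized, which is exactly the accidental-capture case the
shell runs into.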



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


