spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-19213) FileSourceScanExec uses SparkSession from HadoopFsRelation creation time instead of the one active at time of execution
Date Fri, 13 Jan 2017 12:56:26 GMT

    [ https://issues.apache.org/jira/browse/SPARK-19213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821736#comment-15821736 ]

Apache Spark commented on SPARK-19213:
--------------------------------------

User 'robert3005' has created a pull request for this issue:
https://github.com/apache/spark/pull/16575

> FileSourceScanExec uses SparkSession from HadoopFsRelation creation time instead of the one active at time of execution
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-19213
>                 URL: https://issues.apache.org/jira/browse/SPARK-19213
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: Robert Kruszewski
>
> If you look at https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala#L260
> you'll notice that the SparkSession used for execution is the one that was captured from the logical plan.
> In other places you have https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala#L154
> and SparkPlan captures the active session upon execution in https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala#L52
>
> From my understanding of the IO code, it would be beneficial to use the active session
> so that the Hadoop configuration can be modified without recreating the Dataset. It would
> also be interesting not to lock the SparkSession into the physical plan for IO, which would
> let you share Datasets across SparkSessions. Is that supposed to work? Otherwise you would
> have to get a new query execution to bind to the new SparkSession, which would only let you
> share logical plans.

> I am sending a PR along with the latter.
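
The difference the report describes can be illustrated with a small toy model. This is only a sketch in plain Python, not Spark code: `Session`, `Relation`, `CapturedSessionScan`, `ActiveSessionScan`, and the `fs.endpoint` config key are all hypothetical stand-ins invented for illustration. It assumes a thread-local "active session" and a relation that remembers the session that created it.

```python
import threading


class Session:
    """Toy stand-in for a SparkSession: a name plus a Hadoop-style config dict."""

    _active = threading.local()  # per-thread "active session" slot

    def __init__(self, name, hadoop_conf):
        self.name = name
        self.hadoop_conf = dict(hadoop_conf)

    @classmethod
    def set_active(cls, session):
        cls._active.session = session

    @classmethod
    def get_active(cls):
        return getattr(cls._active, "session", None)


class Relation:
    """Toy stand-in for a file relation: remembers the session that created it."""

    def __init__(self, session, path):
        self.session = session
        self.path = path


class CapturedSessionScan:
    """Reads config from the session captured when the relation was created."""

    def __init__(self, relation):
        self.relation = relation

    def execute(self):
        conf = self.relation.session.hadoop_conf
        return (self.relation.path, conf["fs.endpoint"])


class ActiveSessionScan:
    """Reads config from whichever session is active when execute() runs."""

    def __init__(self, relation):
        self.relation = relation

    def execute(self):
        # Fall back to the creation-time session if no session is active.
        session = Session.get_active() or self.relation.session
        return (self.relation.path, session.hadoop_conf["fs.endpoint"])


# Hypothetical config key and values, used only for this illustration.
first = Session("first", {"fs.endpoint": "endpoint-a"})
second = Session("second", {"fs.endpoint": "endpoint-b"})

relation = Relation(first, "/data/events")  # created under the first session
Session.set_active(second)                  # executed under the second session

captured_result = CapturedSessionScan(relation).execute()
active_result = ActiveSessionScan(relation).execute()
```

In this model the captured-session scan keeps reading the first session's configuration even after the second session becomes active, whereas the active-session scan picks up the new configuration without the relation being recreated.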



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

