spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-19798) Query returns stale results when tables are modified on other sessions
Date Sat, 17 Nov 2018 23:41:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-19798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16690731#comment-16690731 ]

Apache Spark commented on SPARK-19798:
--------------------------------------

User 'gbloisi' has created a pull request for this issue:
https://github.com/apache/spark/pull/23074

> Query returns stale results when tables are modified on other sessions
> ----------------------------------------------------------------------
>
>                 Key: SPARK-19798
>                 URL: https://issues.apache.org/jira/browse/SPARK-19798
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Giambattista
>            Priority: Major
>
> I observed the problem on the master branch with the Thrift server in multi-session mode (the default), and I was also able to reproduce it with spark-shell (see the sequence below).
> I observed cases where changes made in one session (table insert, table rename) are not visible to other derived sessions (created with session.newSession).
> The problem seems to be that each session has its own tableRelationCache, which does not get refreshed.
> IMO tableRelationCache should be shared in sharedState, maybe in the cacheManager, so that refresh of caches for data that is not session-specific such as temporary tables gets centralized.
> --- Spark shell script
> val spark2 = spark.newSession
> spark.sql("CREATE TABLE test (a int) using parquet")
> spark2.sql("select * from test").show // OK returns empty
> spark.sql("select * from test").show // OK returns empty
> spark.sql("insert into TABLE test values 1,2,3")
> spark2.sql("select * from test").show // ERROR returns empty
> spark.sql("select * from test").show // OK returns 3,2,1
> spark.sql("create table test2 (a int) using parquet")
> spark.sql("insert into TABLE test2 values 4,5,6")
> spark2.sql("select * from test2").show // OK returns 6,4,5
> spark.sql("select * from test2").show // OK returns 6,4,5
> spark.sql("alter table test rename to test3")
> spark.sql("alter table test2 rename to test")
> spark.sql("alter table test3 rename to test2")
> spark2.sql("select * from test").show // ERROR returns empty
> spark.sql("select * from test").show // OK returns 6,4,5
> spark2.sql("select * from test2").show // ERROR throws java.io.FileNotFoundException
> spark.sql("select * from test2").show // OK returns 3,1,2
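
The reporter's hypothesis (one relation cache per session, never refreshed by writes made elsewhere) can be illustrated with a small toy model. This is a minimal sketch in plain Scala with no Spark dependency: `SharedCatalog`, `Session`, and both helper methods are hypothetical names invented for illustration, and they are not Spark's actual classes or implementation.

```scala
import scala.collection.mutable

object SessionCacheSketch {
  // Toy stand-in for the shared metastore: table name -> current rows.
  final class SharedCatalog {
    private val tables = mutable.Map.empty[String, Vector[Int]]
    def create(name: String): Unit = tables(name) = Vector.empty
    def insert(name: String, rows: Seq[Int]): Unit = tables(name) = tables(name) ++ rows
    def read(name: String): Vector[Int] = tables(name)
  }

  // Toy session (hypothetical, not SparkSession): reads go through a
  // relation cache; a write invalidates only the cache this session holds.
  final class Session(catalog: SharedCatalog, cache: mutable.Map[String, Vector[Int]]) {
    def select(name: String): Vector[Int] =
      cache.getOrElseUpdate(name, catalog.read(name))
    def insert(name: String, rows: Seq[Int]): Unit = {
      catalog.insert(name, rows)
      cache.remove(name)
    }
  }

  // Each session owns a private cache: the second session keeps its stale entry.
  def perSessionCaches(): (Vector[Int], Vector[Int]) = {
    val catalog = new SharedCatalog
    val s1 = new Session(catalog, mutable.Map.empty)
    val s2 = new Session(catalog, mutable.Map.empty)
    catalog.create("test")
    s1.select("test") // both sessions cache the empty table
    s2.select("test")
    s1.insert("test", Seq(1, 2, 3))
    (s1.select("test"), s2.select("test"))
  }

  // Both sessions share one cache: the invalidation is visible to both.
  def sharedCache(): (Vector[Int], Vector[Int]) = {
    val catalog = new SharedCatalog
    val shared = mutable.Map.empty[String, Vector[Int]]
    val s1 = new Session(catalog, shared)
    val s2 = new Session(catalog, shared)
    catalog.create("test")
    s1.select("test")
    s2.select("test")
    s1.insert("test", Seq(1, 2, 3))
    (s1.select("test"), s2.select("test"))
  }
}
```

In this model `perSessionCaches()` returns the inserted rows for the first session and an empty result for the second, mirroring the first ERROR line of the script, while `sharedCache()` returns the inserted rows for both. In a real session, Spark exposes `spark.catalog.refreshTable(tableName)`; whether calling it on the stale session fully works around this issue is not stated in the report.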



