spark-issues mailing list archives

From "Tathagata Das (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (SPARK-21145) Restarted queries reuse same StateStoreProvider, causing multiple concurrent tasks to update same StateStore
Date Fri, 23 Jun 2017 07:44:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-21145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tathagata Das resolved SPARK-21145.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: 2.3.0

Issue resolved by pull request 18355
[https://github.com/apache/spark/pull/18355]

> Restarted queries reuse same StateStoreProvider, causing multiple concurrent tasks to
update same StateStore
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-21145
>                 URL: https://issues.apache.org/jira/browse/SPARK-21145
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 2.2.0
>            Reporter: Tathagata Das
>            Assignee: Tathagata Das
>             Fix For: 2.3.0
>
>
> StateStoreProvider instances are loaded on demand in an executor when a query is started.
When a query is restarted, the already-loaded provider instance is reused. There is a non-trivial
chance that a task from the previous query run is still running when the tasks of the restarted
run start. So for a stateful partition, there may be two concurrent tasks working on
the same stateful partition, and therefore using the same provider instance. This can lead
to inconsistent results and possibly random failures, as state store implementations are not
designed to be thread-safe.
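
The reuse described above can be illustrated with a small toy model. This is only a sketch and not Spark code: `ToyProvider`, `StoreId`, `ProviderCache` and the `key_includes_run_id` switch are hypothetical names made up for this example. It assumes that providers are cached per executor under some key, and it shows that a key without a per-run component hands the restarted run the same instance, while adding a run identifier to the key (one possible remedy, not necessarily what the pull request does) yields separate instances.

```python
import uuid
from dataclasses import dataclass


class ToyProvider:
    """Hypothetical stand-in for a state store provider; not Spark's API."""

    def __init__(self):
        self.state = {}


@dataclass(frozen=True)
class StoreId:
    """Identifies a stateful partition (hypothetical, for illustration)."""
    operator_id: int
    partition_id: int


class ProviderCache:
    """Loads providers on demand and keeps them for later lookups."""

    def __init__(self, key_includes_run_id):
        self.key_includes_run_id = key_includes_run_id
        self._loaded = {}

    def get(self, store_id, run_id):
        # Without the run id in the key, every run of the query maps to
        # the same cached provider for a given stateful partition.
        key = (store_id, run_id) if self.key_includes_run_id else store_id
        if key not in self._loaded:
            self._loaded[key] = ToyProvider()
        return self._loaded[key]


store = StoreId(operator_id=0, partition_id=3)
first_run, restarted_run = uuid.uuid4(), uuid.uuid4()

# Keyed by partition only: a lingering task of the first run and a task of
# the restarted run would both operate on one provider instance.
shared = ProviderCache(key_includes_run_id=False)
same_instance = shared.get(store, first_run) is shared.get(store, restarted_run)

# Keyed by partition and run: each run gets its own provider instance.
isolated = ProviderCache(key_includes_run_id=True)
distinct_instances = (
    isolated.get(store, first_run) is not isolated.get(store, restarted_run)
)
```

In this model the hazard is purely a consequence of the cache key: nothing synchronizes the two tasks, so sharing one non-thread-safe instance between runs is what opens the door to inconsistent state.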



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


