spark-issues mailing list archives

From "guifeng (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Deleted] (SPARK-19645) structured streaming job restart bug
Date Fri, 17 Feb 2017 10:37:42 GMT

     [ https://issues.apache.org/jira/browse/SPARK-19645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

guifeng updated SPARK-19645:
----------------------------
    Comment: was deleted

(was: But Spark master doesn't set rename options that support overwrite, so I think the rename will fail as well. Even with Hadoop 2.6+ it still fails.)
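
The deleted comment points at HDFS rename semantics. As a minimal sketch (not Spark's code, with illustrative paths): the classic FileSystem.rename returns false when the destination already exists, while the Hadoop 2.x FileContext API can be told to overwrite explicitly.

{code:scala}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileContext, FileSystem, Options, Path}

object RenameSemanticsSketch {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration()
    val src = new Path("/tmp/state/0/136/temp-delta") // illustrative paths
    val dst = new Path("/tmp/state/0/136/2.delta")

    // Legacy API: returns false when dst already exists -- no overwrite.
    val fs = FileSystem.get(conf)
    val renamed = fs.rename(src, dst)
    println(s"FileSystem.rename succeeded: $renamed")

    // FileContext (Hadoop 2.x): Rename.OVERWRITE replaces an existing dst.
    val fc = FileContext.getFileContext(conf)
    fc.rename(src, dst, Options.Rename.OVERWRITE)
  }
}
{code}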

> structured streaming job restart bug
> ------------------------------------
>
>                 Key: SPARK-19645
>                 URL: https://issues.apache.org/jira/browse/SPARK-19645
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 2.1.0
>            Reporter: guifeng
>            Priority: Critical
>
>     We are trying to use Structured Streaming in production; however, there is currently a bug in the restart process of a streaming job.
>       The following is the concrete error message:
> {quote}
>    Caused by: java.lang.IllegalStateException: Error committing version 2 into HDFSStateStore[id = (op=0, part=136), dir = /tmp/Pipeline_112346-continueagg-bxaxs/state/0/136]
> 	at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider$HDFSBackedStateStore.commit(HDFSBackedStateStoreProvider.scala:162)
> 	at org.apache.spark.sql.execution.streaming.StateStoreSaveExec$$anonfun$doExecute$3.apply(StatefulAggregate.scala:173)
> 	at org.apache.spark.sql.execution.streaming.StateStoreSaveExec$$anonfun$doExecute$3.apply(StatefulAggregate.scala:123)
> 	at org.apache.spark.sql.execution.streaming.state.StateStoreRDD.compute(StateStoreRDD.scala:64)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
> 	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:99)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Failed to rename /tmp/Pipeline_112346-continueagg-bxaxs/state/0/136/temp--5345709896617324284 to /tmp/Pipeline_112346-continueagg-bxaxs/state/0/136/2.delta
> 	at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider.org$apache$spark$sql$execution$streaming$state$HDFSBackedStateStoreProvider$$commitUpdates(HDFSBackedStateStoreProvider.scala:259)
> 	at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider$HDFSBackedStateStore.commit(HDFSBackedStateStoreProvider.scala:156)
> 	... 14 more
> {quote}
>      The bug can easily be reproduced by restarting a previous streaming job. The main reason is that, on restart, Spark recomputes the WAL offsets and regenerates the same HDFS delta file (the latest delta file generated before the restart, named "currentBatchId.delta"). Because the rename does not overwrite, renaming the recomputed temp file onto the existing delta file fails. In my opinion, this is a bug; if you also consider it one, I can fix it.
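
A minimal sketch of one possible guard for the restart case described above, in Scala (Spark's implementation language); commitDelta, tempFile, and finalFile are hypothetical names, not Spark's actual commitUpdates code:

{code:scala}
import java.io.IOException
import org.apache.hadoop.fs.{FileSystem, Path}

object CommitDeltaSketch {
  def commitDelta(fs: FileSystem, tempFile: Path, finalFile: Path): Unit = {
    if (fs.exists(finalFile)) {
      // A run before the restart already committed this batch's delta
      // file; the recomputed temp file is redundant, so discard it
      // instead of failing on the rename.
      fs.delete(tempFile, false)
    } else if (!fs.rename(tempFile, finalFile)) {
      throw new IOException(s"Failed to rename $tempFile to $finalFile")
    }
  }
}
{code}

A check-then-rename like this still races with concurrent writers; it is only safe under the assumption that any two tasks committing the same batch of the same partition write identical delta data, so losing the race is benign.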



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

