spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gideon <gideon.cal...@volcanodata.com>
Subject Re: Spark Streaming Checkpoint help failed application
Date Wed, 11 Nov 2015 08:54:54 GMT
Hi,

I'm no expert but

Short answer: yes, after restarting your application will reread the failed
messages

Longer answer: it seems like you're mixing several things together
Let me try and explain:
- WAL is used to prevent your application from losing data by making the
executor first write the data it receives from Kafka into WAL and only then
updating the Kafka high level consumer (what the receivers approach is
using) that it actually received the data (making it an at-least once)
- Checkpoints are a mechanism that helps your *driver* recover from failures
by saving driver information into HDFS (or S3 or whatever)

Now, the reason I explained these is this: you asked "... one bug caused the
streaming application to fail and exit" - so the failure you're trying to
solve is in the driver. When you restart your application your driver will
go and fetch the information it last saved in the checkpoint (saved into
HDFS) and order to new executors (since the previous driver died so did the
executors) to continue consuming data. Since your executors are using the
receivers approach (as opposed to the directkafkastream) with WAL what will
happen is that when they (the executors) get started they will first execute
what was saved in the WAL and then read from the latest offsets saved in
Kafka (Zookeeper) which in your case means you won't lose data (the
executors first save the data to WAL then advance their offsets on Kafka)

If you decide to go for the  direct approach
<https://spark.apache.org/docs/latest/streaming-kafka-integration.html#approach-2-direct-approach-no-receivers>
 
then your driver will be the one (and only one) managing the offsets for
Kafka which means that some of the data the driver will save in the
checkpoint will be the Kafka offsets

I hope this helps :)




--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Checkpoint-help-failed-application-tp25347p25357.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Mime
View raw message