spark-dev mailing list archives

From "Thakrar, Jayesh" <jthak...@conversantmedia.com>
Subject Re: Datasource API V2 and checkpointing
Date Fri, 27 Apr 2018 15:53:08 GMT
Wondering if this issue is related to SPARK-23323?

Any pointers would be greatly appreciated.

Thanks,
Jayesh

From: "Thakrar, Jayesh" <jthakrar@conversantmedia.com>
Date: Monday, April 23, 2018 at 9:49 PM
To: "dev@spark.apache.org" <dev@spark.apache.org>
Subject: Datasource API V2 and checkpointing

I was wondering: when checkpointing is enabled, who does the actual work,
the streaming datasource or the execution engine/driver?

I have written a small/trivial datasource that just generates strings.
After enabling checkpointing, I do see a folder being created under the checkpoint folder,
but there's nothing else in there.

Same question for write-ahead and recovery?
And on a restart from a failed streaming session, who should set the offsets,
the driver/Spark or the datasource?

Any pointers to design docs would also be greatly appreciated.

Thanks,
Jayesh
