flink-user-zh mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Zhijiang" <wangzhijiang...@aliyun.com.INVALID>
Subject Re: Rewind offset to a previous position and ensure certainty.
Date Thu, 26 Dec 2019 03:28:34 GMT
If I understood correctly, different partitions of Kafka would be emitted by different source
tasks with different watermark progress.  And the Flink framework would align the different
watermarks to only output the smallest watermark among them, so the events from slow partitions
would not be discarded because the downstream operator would only see the watermark based
on the slow partition atm. You can refer to [1] for some details.

As for rewinding the offset of partition position, I guess it only happens in failure recovery
case or you manually restart the job. Anyway all the topology tasks would be restarted and
previous received watermarks are cleared.
So it would also not discard the events in this case.  Unless you can only rewind some source
task to previous positions and keep other downstream tasks still running, it might have the
issues you concern. But Flink can not support such operation/function atm. :) 

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/event_timestamps_watermarks.html

Best,
Zhijiang
------------------------------------------------------------------
From:邢瑞斌 <xingrobin@gmail.com>
Send Time:2019 Dec. 25 (Wed.) 20:27
To:user-zh <user-zh@flink.apache.org>; user <user@flink.apache.org>
Subject:Rewind offset to a previous position and ensure certainty.

Hi,

I'm trying to use Kafka as an event store and I want to create several partitions to improve
read/write throughput. Occasionally I need to rewind offset to a previous position for recomputing.
Since order isn't guaranteed among partitions in Kafka, does this mean that Flink won't produce
the same results as before when rewind even if it uses event time? For example, consumer for
a partition progresses extremely fast and raises watermark, so events from other partitions
are discarded. Is there any ways to prevent this from happening?

Thanks in advance!

Ruibin

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message