flink-user mailing list archives

From Alexey Trenikhun <yen...@msn.com>
Subject async io parallelism
Date Sat, 22 Feb 2020 01:21:56 GMT
Hello,
Let's say my elements are simple key-value pairs coming from Kafka, where they are
partitioned by "key". I process them with a KeyedProcessFunction (keyed by the same "key"),
enrich them with an ordered RichAsyncFunction, pass the result to another KeyedProcessFunction
(keyed by the same "key"), and finally write to a Kafka topic, again partitioned by the same
"key". Something like this:

FlinkKafkaConsumer -> keyBy("key") -> Intake(KeyedProcessFunction) -> AsyncDataStream.orderedWait()
-> keyBy("key") -> Output(KeyedProcessFunction) -> FlinkKafkaProducer

Will this preserve the order of events with the same "key"?

  *   Will the Output function receive elements with the same "key" in the same order as they
were originally in Kafka?
  *   Will FlinkKafkaProducer write elements with the same "key" in the same order as they were
originally in Kafka?
  *   Does it depend on the parallelism of the async I/O operator? The documentation says "the
stream order is preserved", but if there are multiple parallel instances of the async function,
does that mean order relative to each single instance, or total stream order?
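
To make the last question concrete, here is a minimal sketch in plain Java of the "order relative to each single instance" reading. It is a toy model, not Flink code and not a statement of how Flink is implemented. It assumes that elements with the same "key" always reach the same parallel instance, and that each instance emits results strictly in arrival order even when responses complete out of order. The names OrderedAsyncSketch, OrderedInstance, route, simulate and forKey are hypothetical and exist only for this illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

// Toy model only: none of these names are Flink APIs.
final class OrderedAsyncSketch {

    /** One parallel instance of a hypothetical "ordered" async stage. */
    static final class OrderedInstance {
        private final Queue<CompletableFuture<String>> pending = new ArrayDeque<>();
        private final List<String> emitted = new ArrayList<>();

        /** Registers an in-flight request; the queue remembers arrival order. */
        CompletableFuture<String> submit() {
            CompletableFuture<String> response = new CompletableFuture<>();
            pending.add(response);
            return response;
        }

        /** Emits from the head of the queue only, never past an unfinished request. */
        void drain() {
            while (!pending.isEmpty() && pending.peek().isDone()) {
                emitted.add(pending.poll().join());
            }
        }

        List<String> emitted() {
            return emitted;
        }
    }

    /** Assumed routing rule: the same key always maps to the same instance. */
    static int route(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    /**
     * Sends "key:seq" elements through the instances, completes the responses
     * in reverse arrival order, and returns what each instance emitted.
     */
    static List<List<String>> simulate(List<String> input, int parallelism) {
        List<OrderedInstance> instances = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            instances.add(new OrderedInstance());
        }
        List<CompletableFuture<String>> responses = new ArrayList<>();
        List<OrderedInstance> owners = new ArrayList<>();
        for (String element : input) {
            String key = element.substring(0, element.indexOf(':'));
            OrderedInstance owner = instances.get(route(key, parallelism));
            responses.add(owner.submit());
            owners.add(owner);
        }
        // Mimic out-of-order responses: the last request finishes first.
        for (int i = input.size() - 1; i >= 0; i--) {
            responses.get(i).complete(input.get(i) + ":enriched");
            owners.get(i).drain();
        }
        List<List<String>> result = new ArrayList<>();
        for (OrderedInstance instance : instances) {
            result.add(instance.emitted());
        }
        return result;
    }

    /** Collects the emitted elements of one key, in emission order. */
    static List<String> forKey(List<List<String>> perInstance, String key) {
        List<String> result = new ArrayList<>();
        for (List<String> emitted : perInstance) {
            for (String element : emitted) {
                if (element.startsWith(key + ":")) {
                    result.add(element);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = List.of("a:1", "b:1", "a:2", "b:2", "a:3");
        List<List<String>> out = simulate(input, 2);
        System.out.println("key a: " + forKey(out, "a"));
        System.out.println("key b: " + forKey(out, "b"));
    }
}
```

In this model each key's elements come out in their original order, while nothing is said about the relative order of different keys handled by different instances. Whether the real pipeline satisfies the routing assumption between Intake and the async operator is exactly what I am asking.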

Thanks,
Alexey
