flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Marchant, Hayden " <hayden.march...@citi.com>
Subject Very low-latency - is it possible?
Date Thu, 31 Aug 2017 12:50:55 GMT
We're about to get started on a 9-person-month PoC using Flink Streaming. Before we get started,
I am interested to know how low-latency I can expect for my end-to-end flow for a single event
(from source to sink). 

Here is a very high-level description of our Flink design: 
We need at least once semantics, and our main flow of application is parsing a message ( <
50 microseconds) from Kafka, and then doing a keyBy on the parsed event ( <1kb) and then
updating a very small user state in the KeyedStream, and then doing another keyBy and then
operator of that KeyedStream. Each of the operators is a very simple operation - very little
calculation and no I/O.

** Our requirement is to get close to 1ms (99%) or lower for end-to-end processing (timer
starts once we get message from Kafka). Is this at all realistic if are flow contains 2 aggregations?
 If so, what optimizations might we need to get there regarding cluster configuration (both
Flink and Hardware). Our throughput is possibly small enough (40,000 events per second) that
we could run on one node - which might eliminate some network latency. 

I did read in https://ci.apache.org/projects/flink/flink-docs-master/internals/stream_checkpointing.html
in Exactly Once vs At Least Once that a few milliseconds is considered super low-latency -
wondering if we can get lower.

Any advice or 'war stories' are very welcome.

Hayden Marchant

View raw message