flink-user mailing list archives

From Stefan Richter <s.rich...@data-artisans.com>
Subject Re: R/W traffic estimation between Flink and Zookeeper
Date Thu, 16 Nov 2017 09:59:30 GMT
Hi,

I think ZooKeeper is only used as a metadata store in HA mode. Interactions with ZK are not
part of Flink's per-record stream processing code paths. Depending on your job, things that
are written to ZK can include, for example, the job graph, Kafka offsets, or the metadata
about the checkpoints available to recover from. Some of those interactions happen only once
per job, others happen periodically. In the big picture, interactions with ZK happen rather
rarely, but of course this also depends on configuration parameters like your checkpointing
interval. For a typical job, I would estimate that ZK interactions occur less than once per
second. As for typical message sizes, I would estimate something between a few bytes and a
few kilobytes for most messages, and somewhere in the low two-digit megabytes as a typical
maximum size.
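
As a rough back-of-envelope sketch of that estimate (not a measurement): the function
below turns a checkpointing interval into an approximate ZK operation rate and byte rate.
The function name, the number of ZK operations per checkpoint, and the average message
size are all illustrative assumptions, not values taken from Flink itself.

```python
def estimate_zk_traffic(checkpoint_interval_s, ops_per_checkpoint, avg_message_bytes):
    """Approximate the periodic ZK load caused by checkpointing.

    All three inputs are assumptions supplied by the caller; none of them
    are read from Flink or ZooKeeper.
    Returns a tuple (operations per second, bytes per second).
    """
    if checkpoint_interval_s <= 0:
        raise ValueError("checkpoint interval must be positive")
    ops_per_second = ops_per_checkpoint / checkpoint_interval_s
    bytes_per_second = ops_per_second * avg_message_bytes
    return ops_per_second, bytes_per_second


if __name__ == "__main__":
    # Hypothetical job: checkpoint every 10 s, 3 ZK operations per
    # checkpoint, about 1 KiB per operation.
    ops, rate = estimate_zk_traffic(10.0, 3, 1024)
    print(f"{ops:.2f} ZK ops/s, {rate:.1f} bytes/s")
```

With these assumed inputs the rate stays well below one operation per second. One-off
writes such as the job graph are not covered by this periodic estimate.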

Best,
Stefan

> On 15.11.2017 at 18:41, Hao Sun <hasun@zendesk.com> wrote:
> 
> Thanks Piotr, does Flink read/write to ZooKeeper every time it processes a record?
> I thought only the JM uses ZK to keep some meta-level data, so I am not sure why `it depends
> on many things like state backend used, state size, complexity of your application, size of
> the records, number of machines, their hardware and the network.`
> 
> On Thu, Oct 12, 2017 at 1:35 AM Piotr Nowojski <piotr@data-artisans.com> wrote:
> Hi,
> 
> Are you asking how to measure records/s, or whether it is possible to achieve it? To measure
> it, you can check the numRecordsInPerSecond metric.
> 
> As for whether 1000 records/s is possible, it depends on many things like the state backend
> used, state size, complexity of your application, size of the records, number of machines,
> their hardware and the network. In the very simplest cases it is possible to achieve millions
> of records per second per machine. It would be best to try it out for your particular use
> case on some small scale.
> 
> Piotrek
> 
> > On 11 Oct 2017, at 19:58, Hao Sun <hasun@zendesk.com> wrote:
> >
> > Hi, is there a way to estimate read/write traffic between Flink and ZK?
> > I am looking for something like 1000 reads/sec or 1000 writes/sec, and the size
> > of the messages.
> >
> > Thanks
> 

