flink-user mailing list archives

From Seth Wiesman <swies...@mediamath.com>
Subject state size in relation to cluster size and processing speed
Date Fri, 16 Dec 2016 20:01:19 GMT

I’ve noticed something peculiar about the relationship between state size and cluster size
and was wondering if anyone here knows the reason. I am running a job with 1-hour tumbling
event-time windows that have an allowed lateness of 7 days. When I run on a 20-node cluster
with FsState, I can process approximately 1.5 days’ worth of data in an hour, with the most
recent checkpoint being ~20 GB. If I run the same job with the same configuration on
a 40-node cluster, I can process 2 days’ worth of data in 20 minutes (expected), but the state
size is only ~8 GB. Because allowed lateness is 7 days, no windows should have been purged yet,
and I would expect the larger cluster, which has processed more data, to have the larger state.
Is there some reason why a slower-running job or a smaller cluster would require more state?
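
To put the numbers above side by side, here is a rough back-of-the-envelope sketch in
Python. It only uses the figures quoted in this message. The per-window figure is an
assumption for illustration: it treats the checkpoint as spread evenly over the hourly
windows covered so far, which ignores the number of keys and anything else in the
checkpoint. The names (runs, summarize) are made up for this sketch.

```python
# Figures quoted in the message; checkpoint sizes are approximate.
WINDOW_HOURS = 1  # size of each tumbling window

runs = {
    "20-node": {"days_processed": 1.5, "wall_clock_min": 60, "checkpoint_gb": 20},
    "40-node": {"days_processed": 2.0, "wall_clock_min": 20, "checkpoint_gb": 8},
}


def summarize(run):
    """Derive throughput and a naive per-window state size for one run."""
    # Number of hourly windows (per key) covered by the processed event time.
    windows = run["days_processed"] * 24 / WINDOW_HOURS
    return {
        "windows_per_key": windows,
        # Days of event time processed per hour of wall-clock time.
        "days_per_hour": run["days_processed"] / (run["wall_clock_min"] / 60),
        # Assumption: checkpoint size divided evenly across the windows covered.
        "gb_per_window": run["checkpoint_gb"] / windows,
    }


summary = {name: summarize(run) for name, run in runs.items()}

for name, stats in summary.items():
    print(name, stats)
```

Under that simplification the 20-node run holds roughly 0.56 GB per hourly window and the
40-node run roughly 0.17 GB, so the larger cluster covers more windows with about a third
of the state per window, which is the opposite of what I expected.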

This is more of a curiosity than an issue. Thanks in advance for any insights you may have.

Seth Wiesman