flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Performance impact of many open windows at the same time
Date Mon, 25 May 2020 10:28:31 GMT
Hi,

I don't think this will immediately degrade performance. State is 
essentially stored in a HashMap (for the FileStateBackend) or RocksDB 
(for the RocksDB backend). If these data structures don't degrade with 
size then your performance also shouldn't degrade.

There are of course some effects that would come into play. For example 
if the data grows so much that it negatively affects GC this would of 
course have an effect, but that doesn't come from the data structures as 
they are.

I hope that helps!

Best,
Aljoscha

On 22.05.20 06:23, Tzu-Li (Gordon) Tai wrote:
> Hi Joe,
> 
> The main effect this should have is more state to be kept until the windows
> can be fired (and state purged).
> This would of course increase the time it takes to checkpoint the operator.
> 
> I'm not sure if there will be significant runtime per-record impact caused
> by how windows are bookkeeped in data structures in the WindowOperator,
> maybe Aljoscha (cc'ed) can chime in here for anything.
> If it is certain that these windows will never fire (until far into the
> future) because the event-timestamps are in the first place corrupted, then
> it might make sense to have a way to drop windows based on some criteria.
> I'm not sure if that is supported in any way without triggers (since you
> mentioned that those windows might not receive any data), again Aljoscha
> might be able to provide more info here.
> 
> Cheers,
> Gordon
> 
> On Thu, May 21, 2020 at 7:02 PM Joe Malt <jmalt@yelp.com> wrote:
> 
>> Hi all,
>>
>> I'm looking into what happens when messages are ingested with timestamps
>> far into the future (e.g. due to corruption or a wrong clock at the sender).
>>
>> I'm aware of the effect on watermarking, but another thing I'm concerned
>> about is the performance impact of the extra windows this will create.
>>
>> If a Flink operator has many (perhaps hundreds or thousands) of windows
>> open but not receiving any data (and never firing), will this degrade
>> performance?
>>
>> Thanks,
>> Joe
>>
> 


Mime
View raw message