flink-user mailing list archives

From Lasse Nedergaard <lassenedergaardfl...@gmail.com>
Subject Re: Latency tracking together with broadcast state can cause job failure
Date Wed, 01 Apr 2020 20:03:27 GMT
Hi

I have attached a simple project with a test that reproduces the problem. The usual failure
is a corrupted (mixed-up) string, but you can also get an EOF exception.
Please let me know if you have any questions about the project.

Med venlig hilsen / Best regards
Lasse Nedergaard


> On 1 Apr 2020, at 09:15, Yun Tang <myasuka@live.com> wrote:
> 
> 
> Hi Lasse
> 
> I have never encountered this problem before. Could you share an exception stack trace so
> that we can take a look? A simple project that reproduces it would also be helpful.
> 
> Best
> Yun Tang
> From: Lasse Nedergaard <lassenedergaardflink@gmail.com>
> Sent: Tuesday, March 31, 2020 19:10
> To: user <user@flink.apache.org>
> Subject: Latency tracking together with broadcast state can cause job failure
>  
> Hi
> 
> In both Flink 1.9.2 and 1.10 we have struggled with random deserialization and
> index-out-of-range exceptions in one of our jobs. We also get out-of-memory exceptions.
> We have now identified the cause as latency tracking combined with broadcast state.
> When we run integration tests locally we don’t see any problem; the job only fails
> when running on the cluster.
> We have concluded that latency tracking packets sent over the broadcast stream corrupt
> the data stream and cause the exceptions.
> We are preparing a simple project on GitHub that reproduces the problem, so that the
> underlying issue can be fixed.
> 
> Has anyone else seen this kind of problem?
> 
> Med venlig hilsen / Best regards
> Lasse Nedergaard
> 
