flink-user mailing list archives

From Stefan Richter <s.rich...@data-artisans.com>
Subject Re: Flink 1.1.3 OOME Permgen
Date Tue, 29 Nov 2016 23:55:26 GMT
Hi,

could you provide us with a heap dump from a TM that has been running for a while (ideally,
taken shortly before an OOME)? This would greatly help us figure out whether a classloader
leak is causing the problem.
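
In case it helps, here is a minimal sketch of one way to trigger such a dump programmatically. It assumes a HotSpot JVM, and the class name HeapDumper and the output path are hypothetical. Note that it dumps the JVM it runs in, so it would have to be called from code executing inside the TM. From outside the process, `jmap -dump:live,format=b,file=<file> <pid>` is the usual alternative.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of Flink: writes an .hprof heap dump
// of the JVM in which it is running (HotSpot only).
public class HeapDumper {

    // path: target file, must not exist yet (newer JDKs also require
    //       the name to end in ".hprof")
    // liveOnly: if true, only objects reachable from GC roots are dumped
    public static void dumpHeap(String path, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(path, liveOnly);
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: HeapDumper <output-file.hprof>");
            return;
        }
        dumpHeap(args[0], true);
        System.out.println("Heap dump written to " + args[0]);
    }
}
```

The resulting file can be opened in a heap analyzer to check which classloaders are still reachable after their jobs have finished.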

Best,
Stefan

> On 29.11.2016 at 18:39, Konstantin Knauf <konstantin.knauf@tngtech.com> wrote:
> 
> Hi everyone, 
> 
> since upgrading to Flink 1.1.3 we observe frequent OOME Permgen TaskManager failures.
> Monitoring the permgen size on one of the TaskManagers, you can see that each job (new jobs
> and restarts) adds a few MB, which cannot be collected. Eventually, the OOME happens. This
> happens with all our jobs, streaming and batch, on YARN 2.4 as well as standalone.
> 
> On Flink 1.0.2 this was not a problem, but I will investigate it further.
> 
> The assumption is that Flink is somehow using one of the classes that come with our
> jar and thereby prevents the GC of the whole class loader. Our jars do not include any Flink
> dependencies though (compileOnly), but of course many others.
> 
> Any ideas anyone? 
> 
> Cheers and thank you, 
> 
> Konstantin 
> 
> sent from my phone. Plz excuse brevity and tpyos.
> ---
> Konstantin Knauf *konstantin.knauf@tngtech.com * +49-174-3413182
> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
> Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke

