flink-user mailing list archives

From Robert Metzger <rmetz...@apache.org>
Subject Re: Flink 1.1.3 OOME Permgen
Date Fri, 02 Dec 2016 14:21:04 GMT
Thank you for reporting the issue Konstantin.
I've filed a JIRA for the jackson issue:
https://issues.apache.org/jira/browse/FLINK-5233.
As I said in the JIRA, I propose upgrading to Jackson 2.7.8, as this
version contains the fix for the issue but is not a major Jackson upgrade.

Could you check whether 2.7.8 fixes the issue for you as well?


On Fri, Dec 2, 2016 at 11:12 AM, Fabian Hueske <fhueske@gmail.com> wrote:

> Hi Konstantin,
>
> Regarding 2): I've opened FLINK-5227 to update the documentation [1].
>
> Regarding the Row type: The Row type was introduced for flink-table and
> was later used by other modules. There is FLINK-5186 to move Row and all
> the related TypeInfo (+serializer and comparator) to flink-core [2]. That
> should solve your issue.
>
> Some of the connector modules which provide TableSources and TableSinks
> have dependencies on flink-table as well. I'll check that these are
> optional dependencies, so that we do not pull in Calcite through connectors
> for jobs that do not need it.
>
> Thanks,
> Fabian
>
> [1] https://issues.apache.org/jira/browse/FLINK-5227
> [2] https://issues.apache.org/jira/browse/FLINK-5186
>
> 2016-11-30 17:51 GMT+01:00 Konstantin Knauf <konstantin.knauf@tngtech.com>:
>
>> Hi Stefan,
>>
>> Unfortunately, I cannot share any heap dumps with you. I was able to
>> resolve some of the issues myself today; the root causes were different
>> for different jobs.
>>
>> 1) Jackson 2.7.2 (which comes with Flink) has a known class loading
>> issue (see https://github.com/FasterXML/jackson-databind/issues/1363).
>> Shipping a shaded version of Jackson 2.8.4 with our user code helped. I
>> recommend upgrading Flink's Jackson version soon.
>>
>> 2) We have a dependency on flink-table [1], which ships with Calcite,
>> including the Calcite JDBC driver, which cannot be garbage-collected
>> because of the known problem with java.sql.DriverManager. Putting
>> flink-table in Flink's lib dir instead of shipping it with the user code
>> helps. You should update the documentation, because I think this will
>> always happen when using flink-table. So I wonder why this has not come
>> up before.
>>
>> 3) Unresolved: Some threads in a custom source are not properly shut
>> down and keep references to the UserCodeClassLoader. I have not had
>> time to look into this issue so far.
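
Regarding point 3) above, a minimal sketch of a background thread that is stopped and joined on shutdown, so that it no longer holds on to the class loader that created it. The class name BackgroundPoller, the 50 ms sleep, and the 5 second join timeout are hypothetical; this is not the actual custom source, and in a Flink source the shutdown() call would presumably go into the cancel/close path.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the background part of a custom source. */
final class BackgroundPoller {

    private volatile boolean running = true;
    private final Thread worker;

    BackgroundPoller() {
        worker = new Thread(() -> {
            while (running) {
                try {
                    // Placeholder for the real polling work.
                    TimeUnit.MILLISECONDS.sleep(50);
                } catch (InterruptedException e) {
                    // Restore the flag and leave the loop promptly.
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }, "background-poller");
        worker.setDaemon(true);
        worker.start();
    }

    boolean isAlive() {
        return worker.isAlive();
    }

    /** Stops the worker and waits for it; returns true if it has terminated. */
    boolean shutdown() {
        running = false;
        worker.interrupt();
        try {
            worker.join(5_000); // assumed upper bound for the wait
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }
}
```

A live thread keeps its Runnable and its context class loader reachable, so a worker that outlives the job can keep the whole user code class loader from being collected.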
>>
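
Regarding point 2) above, a sketch of how user code could deregister JDBC drivers that were loaded by its own class loader. The helper name JdbcDriverCleanup is hypothetical, and whether this is sufficient for the Calcite driver shipped with flink-table has not been verified here; the fix reported above was to put flink-table into Flink's lib dir. Note that DriverManager only lets a caller see and deregister drivers visible to the caller's own class loader, so such a helper would have to run from the user code itself.

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Collections;

/** Hypothetical helper: removes JDBC drivers that pin a given class loader. */
final class JdbcDriverCleanup {

    /**
     * Deregisters every visible driver whose class was loaded by the given
     * loader and returns how many drivers were removed.
     */
    static int deregisterDriversOf(ClassLoader loader) {
        int removed = 0;
        // Copy the enumeration first so deregistration cannot disturb iteration.
        for (Driver driver : Collections.list(DriverManager.getDrivers())) {
            if (driver.getClass().getClassLoader() == loader) {
                try {
                    DriverManager.deregisterDriver(driver);
                    removed++;
                } catch (SQLException e) {
                    // Best effort: skip drivers that cannot be deregistered.
                }
            }
        }
        return removed;
    }
}
```

A possible call site, assuming it runs inside the user code, would be JdbcDriverCleanup.deregisterDriversOf(Thread.currentThread().getContextClassLoader()) in a close hook.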
>> Cheers,
>>
>> Konstantin
>>
>> [1] Side note: We only need flink-table for the "Row" class used in the
>> JdbcOutputFormat, so it might make sense to move this class somewhere
>> else. Naturally, we also tried to exclude the "transitive" dependency on
>> org.apache.calcite until we noticed that Calcite is packaged with
>> flink-table, so that you cannot even exclude it. What is the reason
>> for this?
>>
>>
>>
>>
>> On 30.11.2016 00:55, Stefan Richter wrote:
>> > Hi,
>> >
>> > Could you somehow provide us with a heap dump from a TM that has run for
>> > a while (ideally, shortly before an OOME)? This would greatly help us
>> > figure out whether there is a classloader leak that causes the problem.
>> >
>> > Best,
>> > Stefan
>> >
>> >> On 29.11.2016 at 18:39, Konstantin Knauf <konstantin.knauf@tngtech.com> wrote:
>> >>
>> >> Hi everyone,
>> >>
>> >> Since upgrading to Flink 1.1.3 we observe frequent OOME PermGen
>> >> TaskManager failures. Monitoring the PermGen size on one of the
>> >> TaskManagers, you can see that each job (new jobs and restarts) adds a
>> >> few MB that cannot be collected. Eventually, the OOME happens. This
>> >> happens with all our jobs, streaming and batch, on YARN 2.4 as well as
>> >> standalone.
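
A minimal sketch of how the growth described above could be observed from inside a JVM using the standard java.lang.management beans. The class name ClassMetadataReport is hypothetical, and the pool names it matches ("Perm Gen" on Java 7 and earlier, "Metaspace" on Java 8 and later) are the usual HotSpot names and may differ on other JVMs.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: reports class-metadata pools and class load counts. */
final class ClassMetadataReport {

    /** Returns one line per memory pool whose name mentions PermGen or Metaspace. */
    static List<String> poolLines() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Assumed HotSpot pool names; other JVMs may use different ones.
            if (name.contains("Perm Gen") || name.contains("Metaspace")) {
                MemoryUsage usage = pool.getUsage();
                if (usage != null) {
                    lines.add(name + ": used=" + usage.getUsed() / 1024 + " KiB");
                }
            }
        }
        return lines;
    }

    /** Returns the current and unloaded class counts as a single line. */
    static String classCounts() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return "loaded=" + bean.getLoadedClassCount()
                + " unloaded=" + bean.getUnloadedClassCount();
    }

    public static void main(String[] args) {
        poolLines().forEach(System.out::println);
        System.out.println(classCounts());
    }
}
```

If the loaded count keeps rising across job submissions while the unloaded count stays flat, that would be consistent with user code class loaders not being collected.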
>> >>
>> >> On Flink 1.0.2 this was not a problem, but I will investigate it
>> >> further.
>> >>
>> >> My assumption is that Flink is somehow using one of the classes that
>> >> come with our jar and thereby prevents the GC of the whole class
>> >> loader. Our jars do not include any Flink dependencies (they are
>> >> compileOnly), but of course many others.
>> >>
>> >> Any ideas anyone?
>> >>
>> >> Cheers and thank you,
>> >>
>> >> Konstantin
>> >>
>> >> sent from my phone. Plz excuse brevity and tpyos.
>> >> ---
>> >> Konstantin Knauf *konstantin.knauf@tngtech.com * +49-174-3413182
>> >> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
>> >> Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
>> >
>> >
>>
>> --
>> Konstantin Knauf * konstantin.knauf@tngtech.com * +49-174-3413182
>> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
>> Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
>> Sitz: Unterföhring * Amtsgericht München * HRB 135082
>>
>>
>
