flink-user mailing list archives

From Jörn Franke <jornfra...@gmail.com>
Subject Re: Memory Issue
Date Mon, 21 Aug 2017 06:14:30 GMT
One would need to look at your code and possibly at some heap statistics. Maybe something goes wrong
when you cache the objects (do you use a third-party library or your own implementation?).
Do you use a stable version of your protobuf library (not necessarily the most recent)? You
may also want to look at reusable buffers to avoid creating objects (ByteBuffer, StringBuffer, etc.).
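
As a rough illustration of the kind of heap statistics meant here, the following sketch reads GC counts, accumulated GC time and current heap usage from the standard JVM management beans. The class name GcStats and its helper methods are made up for this example; only the java.lang.management calls are part of the JDK. It inspects the JVM it runs in, so for a Flink job it would have to run inside the task manager process (or you would use JMX or GC logs instead).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class GcStats {

    /** Sum of collection counts over all collectors of this JVM. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Sum of the approximate accumulated collection time in milliseconds. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // One line per collector; the bean names show which collectors are active.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d, time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.printf("heap used=%d MB, committed=%d MB%n",
                heap.getUsed() >> 20, heap.getCommitted() >> 20);
    }
}
```

Sampling these values periodically and looking at the deltas shows how often the young generation is collected and how much time each collection costs on average.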

You are probably creating a lot of objects due to the conversion into POJOs. You could increase
the heap space for the young generation.
You can also switch to the G1 garbage collector (if on JDK 8) or at least the parallel one.
Generally, you should avoid creating POJOs/objects as much as possible in a long-running streaming job.
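
A minimal sketch of what avoiding per-record allocations can look like in plain Java, independent of Flink: one mutable holder and one StringBuilder are allocated once and reused for every record. The names ReuseSketch and Reading are hypothetical and only stand in for a user function and a POJO built from a protobuf message.

```java
public class ReuseSketch {

    /** Hypothetical mutable record, standing in for a POJO built from a protobuf message. */
    static final class Reading {
        String key;
        long value;

        Reading set(String key, long value) {
            this.key = key;
            this.value = value;
            return this;
        }
    }

    // Allocated once per function instance and reused for every record.
    private final Reading reused = new Reading();
    private final StringBuilder sb = new StringBuilder(64);

    /** Formats one record without allocating a new Reading or StringBuilder. */
    String format(String key, long value) {
        Reading r = reused.set(key, value);
        sb.setLength(0); // reset the builder instead of creating a new one
        sb.append(r.key).append('=').append(r.value);
        return sb.toString(); // the result String is still a new object
    }

    public static void main(String[] args) {
        ReuseSketch sketch = new ReuseSketch();
        System.out.println(sketch.format("a", 1));
        System.out.println(sketch.format("b", 2));
    }
}
```

Reuse like this is only safe when nothing downstream keeps a reference to the mutable object after the call returns, and when the instance is not shared between threads.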

> On 21. Aug 2017, at 05:29, Govindarajan Srinivasaraghavan <govindraghvan@gmail.com> wrote:
> Hi,
> I have a pipeline running on Flink which ingests around 6k messages per second. Each
message is around 1 KB and passes through various stages such as a filter, a 5-second tumbling
window per key, etc., and finally a flatMap for computation before it is sent to a Kafka sink.
The data is first ingested as protocol buffers, and in subsequent operators it is converted
into POJOs.
> Lots of objects are created inside the user functions, and some of them are cached
as well. I have been running this pipeline on 48 task slots across 3 task managers, each
allocated 22 GB of memory.
> The issue I'm having is that within a period of 10 hours, almost 19k young generation GCs
have run, which is roughly one every 2 seconds, and the reported GC time is more than 2 million.
I have also enabled object reuse. Any suggestions on how this issue could be resolved? Thanks.
> Regards,
> Govind
