flink-user mailing list archives

From ebru <b20926...@cs.hacettepe.edu.tr>
Subject Re: Flink memory leak
Date Tue, 07 Nov 2017 13:35:51 GMT
Hi Ufuk,

We don’t explicitly define any state descriptors. We only use map and filter operators. We
thought that the GC would handle clearing Flink’s internal state.
So how can we manage the memory if it keeps increasing?

- Ebru
> On 7 Nov 2017, at 16:23, Ufuk Celebi <uce@apache.org> wrote:
> 
> Hey Ebru, the memory usage might be increasing as long as a job is running. This is expected
> (also in the case of multiple running jobs). The screenshots are not helpful in that regard.
> :-(
> 
> What kind of stateful operations are you using? Depending on your use case, you have
> to manually call `clear()` on the state instance in order to release the managed state.
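
To illustrate the point about `clear()`, here is a minimal plain-Java sketch. It is not the Flink API: `KeyedStateSketch` and its methods are hypothetical stand-ins, assuming only that per-key state is kept in a map until it is explicitly released. It shows why state that is never cleared grows with the number of distinct keys, and how clearing brings it back down.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for per-key managed state; NOT a Flink class.
public class KeyedStateSketch {
    // One entry per key, retained until clear(key) is called.
    private final Map<String, Long> countsByKey = new HashMap<>();

    // Update the state for a key and return the new count.
    public long add(String key) {
        return countsByKey.merge(key, 1L, Long::sum);
    }

    // Release the state held for a key (similar in spirit to calling
    // clear() on a state instance once the key is no longer needed).
    public void clear(String key) {
        countsByKey.remove(key);
    }

    // Number of keys that still hold state.
    public int trackedKeys() {
        return countsByKey.size();
    }

    public static void main(String[] args) {
        KeyedStateSketch state = new KeyedStateSketch();

        // Every new key adds an entry that the GC cannot reclaim,
        // because the map still references it.
        for (int i = 0; i < 1000; i++) {
            state.add("session-" + i);
        }
        System.out.println(state.trackedKeys()); // prints 1000

        // Explicitly clearing finished keys releases their state.
        for (int i = 0; i < 1000; i++) {
            state.clear("session-" + i);
        }
        System.out.println(state.trackedKeys()); // prints 0
    }
}
```

In a real job the equivalent call would be made on the state object itself, typically when a key is known to be finished. When and how to do that depends on the operators in use.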
> 
> Best,
> 
> Ufuk
> 
> On Tue, Nov 7, 2017 at 12:43 PM, ebru <b20926247@cs.hacettepe.edu.tr> wrote:
> 
> 
>> Begin forwarded message:
>> 
>> From: ebru <b20926247@cs.hacettepe.edu.tr>
>> Subject: Re: Flink memory leak
>> Date: 7 November 2017 at 14:09:17 GMT+3
>> To: Ufuk Celebi <uce@apache.org>
>> 
>> Hi Ufuk,
>> 
>> Here are three snapshots of htop output.
>> 1. The first snapshot is the initial state.
>> 2. The second snapshot is after submitting one job.
>> 3. The third snapshot is the output with that one job running at 15000 EPS. The memory
>> usage is always increasing over time.
>> 
>> 
>> 
>> 
>> <1.png><2.png><3.png>
>>> On 7 Nov 2017, at 13:34, Ufuk Celebi <uce@apache.org> wrote:
>>> 
>>> Hey Ebru,
>>> 
>>> let me pull in Aljoscha (CC'd) who might have an idea what's causing this.
>>> 
>>> Since multiple jobs are running, it will be hard to understand which
>>> job the state descriptors from the heap snapshot belong to.
>>> - Is it possible to isolate the problem and reproduce the behaviour
>>> with only a single job?
>>> 
>>> – Ufuk
>>> 
>>> 
>>> On Tue, Nov 7, 2017 at 10:27 AM, ÇETİNKAYA EBRU
>>> <b20926247@cs.hacettepe.edu.tr> wrote:
>>>> Hi,
>>>> 
>>>> We are using Flink 1.3.1 in production, with one job manager and 3 task
>>>> managers in standalone mode. Recently, we've noticed that we have memory
>>>> related problems. We use Docker to serve the Flink cluster. We have
>>>> 300 slots, and 20 jobs are running with a parallelism of 10. The job count
>>>> may also change over time. Task manager memory usage always increases. After
>>>> a job is cancelled, this memory usage doesn't decrease. We've tried to
>>>> investigate the problem and have taken a task manager JVM heap snapshot.
>>>> According to the JVM heap analysis, a possible memory leak is Flink's list
>>>> state descriptor. But we are not sure that this is the cause of our memory
>>>> problem. How can we solve the problem?
>> 
> 
> 

