flink-user mailing list archives

From Stephan Ewen <se...@apache.org>
Subject Re: taskmanager memory leak
Date Thu, 21 Jul 2016 08:54:47 GMT
Hi!

There is a memory debugging logger; you can activate it as described here:
https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#memory-and-performance-debugging

It will print which parts of the memory are growing.
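
For reference, a minimal sketch of what the corresponding flink-conf.yaml
entries might look like. The key names are assumed from the configuration
page linked above as it applied to the 1.0.x line (later releases renamed
them, so check the documentation for your version), and the 5000 ms interval
is only an illustrative value:

```yaml
# Assumed key names for Flink 1.0.x; verify against the docs for your version.
# Start a thread in each TaskManager that periodically logs memory and GC statistics.
taskmanager.debug.memory.startLogThread: true
# Logging interval in milliseconds (illustrative value).
taskmanager.debug.memory.logIntervalMs: 5000
```

The statistics should then appear in the TaskManager log files after the
TaskManagers are restarted with the new configuration.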

What you can also try is to deactivate checkpointing for one run and see if
that solves it. If it does, I suspect a memory leak in the S3
library (are you using s3, s3a, or s3n?).

Can you also check what libraries you are using? We have seen cases of
memory leaks in the libraries people used.

Greetings,
Stephan



On Thu, Jul 21, 2016 at 5:13 AM, 김동일 <kim.same@gmail.com> wrote:

> Hi Stephan,
>
> - Did you submit any job to the cluster, or is the memory just growing
> even on an idle TaskManager?
>
> Yes, I am running streaming jobs.
>
> - If you are running a job, do you use the RocksDB state backend, or the
> FileSystem state backend?
>
> The FileSystem state backend. I use S3.
>
> - Does it grow infinitely, or only up to a certain point and then go down
> again?
>
> I think it grows infinitely. The kernel kills the process (OOM).
>
>
>
> On Thu, Jul 21, 2016 at 3:52 AM Stephan Ewen <sewen@apache.org> wrote:
>
>> Hi!
>>
>> In order to answer this, we need a bit more information. Here are some
>> follow-up questions:
>>
>>   - Did you submit any job to the cluster, or is the memory just growing
>> even on an idle TaskManager?
>>   - If you are running a job, do you use the RocksDB state backend, or
>> the FileSystem state backend?
>>   - Does it grow infinitely, or only up to a certain point and then go
>> down again?
>>
>> Greetings,
>> Stephan
>>
>>
>> On Wed, Jul 20, 2016 at 5:58 PM, 김동일 <kim.same@gmail.com> wrote:
>>
>>> Oh, my Flink version is 1.0.3.
>>>
>>>
>>> ---------- Forwarded message ----------
>>> From: 김동일 <kim.same@gmail.com>
>>> Date: Thu, Jul 21, 2016 at 12:52 AM
>>> Subject: taskmanager memory leak
>>> To: user@flink.apache.org
>>>
>>>
>>> I've set up a standalone cluster.
>>> The TaskManager consumes more memory than the Xmx setting allows, and
>>> its usage grows continuously.
>>> I saw this thread (
>>> http://mail-archives.apache.org/mod_mbox/flink-dev/201606.mbox/%3CCAK2vtervsw4muBOc4SWix0mR6Y9biJznjuYpF6_f9f0g9-_6LA@mail.gmail.com%3E
>>> ).
>>> So I set taskmanager.memory.preallocate to true, but that did not
>>> solve the problem.
>>>
>>> My Java version is:
>>>
>>> java version "1.8.0_20"
>>> Java(TM) SE Runtime Environment (build 1.8.0_20-b26)
>>> Java HotSpot(TM) 64-Bit Server VM (build 25.20-b23, mixed mode)
>>>
>>> And my flink-conf.yaml:
>>>
>>> env.java.home: /usr/java/default
>>> jobmanager.rpc.address: internal.stream01.denma.ggportal.net
>>> jobmanager.rpc.port: 6123
>>> jobmanager.heap.mb: 2048
>>> taskmanager.heap.mb: 8192
>>> taskmanager.memory.off-heap: true
>>> taskmanager.numberOfTaskSlots: 4
>>> taskmanager.memory.preallocate: false
>>> parallelism.default: 2
>>> jobmanager.web.port: 8081
>>> jobmanager.web.submit.enable: true
>>> state.backend: filesystem
>>> state.backend.fs.checkpointdir: s3a://denma.live/flink/datum/checkpoints
>>> taskmanager.network.numberOfBuffers: 8192
>>> taskmanager.tmp.dirs: /opt/flink/var/tmp
>>> fs.hdfs.hadoopconf: /opt/flink/conf/
>>> recovery.mode: zookeeper
>>> recovery.zookeeper.quorum: ....
>>> recovery.zookeeper.storageDir: s3a://denma.live/flink/datum/recovery
>>> recovery.jobmanager.port: 50000-50100
>>> recovery.zookeeper.path.root: /flink
>>> blob.server.port: 50100-50200
>>> blob.storage.directory: /opt/flink/var/tmp/flink-blob
>>> taskmanager.rpc.port: 6122
>>> taskmanager.data.port: 6121
>>>
>>>
>>> I need help. What should I do?
>>> Thanks in advance.
>>>
>>>
>>>
>>>
>>
>>
