flink-dev mailing list archives

From "Bhumika Bayani (JIRA)" <j...@apache.org>
Subject [jira] [Created] (FLINK-8622) flink-mesos: High memory usage of scheduler + job manager. GC never kicks in.
Date Fri, 09 Feb 2018 10:06:00 GMT
Bhumika Bayani created FLINK-8622:

             Summary: flink-mesos: High memory usage of scheduler + job manager. GC never kicks in.
                 Key: FLINK-8622
                 URL: https://issues.apache.org/jira/browse/FLINK-8622
             Project: Flink
          Issue Type: Bug
    Affects Versions: 1.3.2, 1.4.0
            Reporter: Bhumika Bayani

We are deploying a Flink cluster with 1 jobmanager and 6 taskmanagers on Mesos.

We have observed that the memory usage of the jobmanager is high. No matter how much more
memory we allocate to it, it hits the limit within minutes.

We started with 1.5 GB RAM and a 1 GB heap. Currently we have allocated 4 GB RAM and a 3 GB
heap to the jobmanager-cum-scheduler. We also tried allocating 8 GB RAM while keeping the heap
the same (3 GB). In that case too, the memory graph was identical.

As the graph below shows, the scheduler almost always runs at its maximum memory allocation.



Throughout the run of the scheduler, we do not see memory usage go down unless it is killed
due to OOM. From this we infer that garbage collection is never happening.

We have tried both Flink 1.4 and 1.3 and see the same issue on both versions.


Is there any way we can find out where and how the memory is being used?
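
One way to start answering this is to ask the JVM itself, via the standard java.lang.management
beans, how much heap and non-heap memory is in use and how many collections have run. A
container-level memory graph may not show this on its own. The sketch below is only an
illustration: the class name MemoryProbe and its helper methods are hypothetical, not part of
Flink, and it reports on the JVM it runs in, so to inspect the jobmanager the same beans would
have to be read inside that process or over JMX.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper (not a Flink class): reports memory and GC activity of the current JVM.
class MemoryProbe {

    // Converts bytes to whole megabytes; the beans report -1 for "undefined", which is kept.
    public static long toMb(long bytes) {
        return bytes < 0 ? -1 : bytes / (1024L * 1024L);
    }

    // Sums the collection counts of all garbage collectors registered in this JVM.
    public static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report a count
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Builds a one-line summary of heap usage, non-heap usage, and GC count.
    public static String report() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();
        return String.format(
            "heap used=%d MB committed=%d MB max=%d MB; non-heap used=%d MB; gc count=%d",
            toMb(heap.getUsed()), toMb(heap.getCommitted()), toMb(heap.getMax()),
            toMb(nonHeap.getUsed()), totalGcCount());
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

If the GC count keeps rising while "heap used" stays well below the 3 GB maximum, that would
suggest the collector is running and the memory is held outside the heap. A GC count that stays
flat would be consistent with the inference above.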

Are there any Flink config options for the jobmanager, or JVM parameters, that can help us
restrict the memory usage, force garbage collection, and prevent it from crashing?

Please let us know if there are any resource recommendations from Flink for running Flink on
Mesos at scale.

