flink-issues mailing list archives

From NicoK <...@git.apache.org>
Subject [GitHub] flink pull request #4506: [FLINK-7400][cluster] fix off-heap limits set to c...
Date Wed, 09 Aug 2017 14:39:33 GMT
GitHub user NicoK opened a pull request:

    https://github.com/apache/flink/pull/4506

    [FLINK-7400][cluster] fix off-heap limits set too conservatively in cluster environments

    ## What is the purpose of the change
    
    Since #3648, `ContaineredTaskManagerParameters` sets `offHeapSize` to the amount of
off-heap memory that Flink itself will use. In various cases, e.g. YARN or Mesos, this value
is then passed as `-XX:MaxDirectMemorySize`. It does not account for off-heap use by
components other than Flink, e.g. RocksDB, other libraries, or the JVM itself.
    
    Please note that this affects at least all batch programs with the following options set
(which do not make much sense for streaming):
    ```
    taskmanager.memory.off-heap=true
    taskmanager.memory.size=<any value>
    taskmanager.memory.preallocate=true
    ```
    If `taskmanager.memory.fraction` is used instead, programs may be safe due to https://issues.apache.org/jira/browse/FLINK-7401,
but the additional buffer this provides may be too small, especially if RocksDB
or other libraries using off-heap memory are in use.
    
    This PR adds the `cutoff` from the `containerized.heap-cutoff-ratio`/`containerized.heap-cutoff-min`
configuration parameters to `offHeapSize` as implied by the description of these two options.
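
As a rough illustration of the change described above, the following Java sketch compares the direct-memory limit before and after adding the cut-off. It is not the code in `ContaineredTaskManagerParameters`: the class and method names are hypothetical, the numbers are example values, and the rule that the cut-off is the larger of `ratio * container memory` and the configured minimum is an assumption about how the two options combine.

```java
/** Hypothetical sketch of the off-heap limit calculation; not Flink's actual code. */
public final class OffHeapLimitSketch {

    /**
     * Assumed cut-off rule: the larger of (container memory * ratio) and the
     * configured minimum, both in megabytes.
     */
    static long cutoffMB(long containerMemoryMB, double cutoffRatio, long minCutoffMB) {
        return Math.max((long) (containerMemoryMB * cutoffRatio), minCutoffMB);
    }

    /** Limit before the change: only the off-heap memory Flink itself uses. */
    static long limitBeforeMB(long flinkOffHeapMB) {
        return flinkOffHeapMB;
    }

    /** Limit after the change: Flink's off-heap memory plus the cut-off. */
    static long limitAfterMB(long flinkOffHeapMB, long cutoffMB) {
        return flinkOffHeapMB + cutoffMB;
    }

    public static void main(String[] args) {
        long containerMemoryMB = 4096; // example container size
        double cutoffRatio = 0.25;     // example value for containerized.heap-cutoff-ratio
        long minCutoffMB = 600;        // example value for containerized.heap-cutoff-min
        long flinkOffHeapMB = 2048;    // example off-heap memory managed by Flink

        long cutoff = cutoffMB(containerMemoryMB, cutoffRatio, minCutoffMB);
        System.out.println("cut-off (MB): " + cutoff);
        System.out.println("limit before (MB): " + limitBeforeMB(flinkOffHeapMB));
        System.out.println("limit after (MB): " + limitAfterMB(flinkOffHeapMB, cutoff));
    }
}
```

With these example values, the difference between the two limits is exactly the cut-off, which is the headroom left for RocksDB, other libraries, and the JVM itself.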
    
    ## Brief change log
    
    - include the cut-off memory (removed from the container memory size for further calculations)
into the off-heap part
    - add a unit test verifying the bug fix in a YARN environment
    
    ## Verifying this change
    
    This change added tests and can be verified as follows:
    
    - added `YARNSessionCapacitySchedulerITCase#perJobYarnClusterOffHeap()` test that validates
that we have enough memory available and the bounds are not too strict
    
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): (no)
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
      - The serializers: (no)
      - The runtime per-record code paths (performance sensitive): (no)
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing,
Yarn/Mesos, ZooKeeper: (yes: memory calculations)
    
    ## Documentation
    
      - Does this pull request introduce a new feature? (no)
      - If yes, how is the feature documented? (JavaDocs)
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/NicoK/flink flink-7400

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4506.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4506
    
----
commit 60d40cde20686b4b1b2d15dc838b15ed0cd994cc
Author: Nico Kruber <nico@data-artisans.com>
Date:   2017-08-09T09:53:03Z

    [FLINK-7400][cluster] fix cut-off memory not used for off-heap reserve as intended
    
    + fix description of `containerized.heap-cutoff-ratio`

commit 4135a223288608444d324da333cfdd70117c796d
Author: Nico Kruber <nico@data-artisans.com>
Date:   2017-08-09T14:16:31Z

    [FLINK-7400][yarn] add an integration test for yarn container memory restrictions using
off-heap memory

----


