ignite-issues mailing list archives

From "Dmitriy Setrakyan (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (IGNITE-2419) Ignite on YARN do not handle memory overhead
Date Wed, 20 Jan 2016 19:43:40 GMT

     [ https://issues.apache.org/jira/browse/IGNITE-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy Setrakyan updated IGNITE-2419:
    Fix Version/s: 1.6

> Ignite on YARN do not handle memory overhead
> --------------------------------------------
>                 Key: IGNITE-2419
>                 URL: https://issues.apache.org/jira/browse/IGNITE-2419
>             Project: Ignite
>          Issue Type: Bug
>          Components: hadoop
>         Environment: hadoop cluster with YARN
>            Reporter: Edouard Chevalier
>            Assignee: Vladimir Ozerov
>            Priority: Critical
>             Fix For: 1.6
> When deploying Ignite nodes with YARN, each JVM is launched with a defined amount of heap
> memory (the property IGNITE_MEMORY_PER_NODE, transposed to the JVM "-Xmx" option), and YARN
> is asked to provide a container of exactly that size. But YARN monitors the memory of the
> overall process, not just the heap: a JVM can easily require more memory than its heap
> (VM and/or native overheads, per-thread overhead, and in Ignite's case possibly offheap
> data structures). If tasks use all of the heap, the process memory ends up far above the
> heap size. YARN then considers that the node should be killed (and kills it!) and creates
> another one. I have a scenario where tasks require all of the JVM heap and YARN is
> continuously allocating/deallocating containers; the global task never finishes.
> My proposal is to implement a property IGNITE_OVERHEADMEMORY_PER_NODE, similar to Spark's
> spark.yarn.executor.memoryOverhead (see https://spark.apache.org/docs/latest/running-on-yarn.html#configuration).
> I can implement it and create a pull request on GitHub.
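For illustration, the container sizing the reporter proposes could look like the sketch below. The property name IGNITE_OVERHEADMEMORY_PER_NODE comes from the proposal above; the 10%-of-heap / 384 MB minimum fallback is borrowed from Spark's memoryOverhead default and is an assumption here, not Ignite's actual behavior.

```java
// Hypothetical sketch: sizing the YARN container request as heap (-Xmx)
// plus a non-heap overhead, instead of heap alone. Not Ignite's real code.
public class ContainerMemorySketch {
    static final int MIN_OVERHEAD_MB = 384;       // floor, as in Spark's default (assumption)
    static final double OVERHEAD_FRACTION = 0.10; // fraction of heap (assumption)

    /** Memory to request from YARN: heap plus explicit or estimated overhead. */
    static int containerMemoryMb(int heapMb, Integer explicitOverheadMb) {
        int overhead = explicitOverheadMb != null
            ? explicitOverheadMb                   // IGNITE_OVERHEADMEMORY_PER_NODE set
            : Math.max(MIN_OVERHEAD_MB, (int) (heapMb * OVERHEAD_FRACTION));
        return heapMb + overhead;
    }

    public static void main(String[] args) {
        // 2048 MB heap, no explicit overhead: 2048 + max(384, 204) = 2432 MB
        System.out.println(containerMemoryMb(2048, null));
        // 2048 MB heap with an explicit 512 MB overhead: 2560 MB
        System.out.println(containerMemoryMb(2048, 512));
    }
}
```

Requesting the larger container keeps the process footprint (heap plus native/thread/offheap usage) under YARN's monitored limit, so the NodeManager no longer kills the node when the heap fills up.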

This message was sent by Atlassian JIRA
