flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1442) Archived Execution Graph consumes too much memory
Date Mon, 26 Jan 2015 22:09:34 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292498#comment-14292498 ]

ASF GitHub Bot commented on FLINK-1442:
---------------------------------------

Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/344#issuecomment-71548964
  
    Looks good so far. I see that you removed the LRU code. Was that on purpose?

    Leaving it in may be a good idea, because soft references are cleared in arbitrary
    order, which may make newer jobs disappear before older ones. Keeping the LRU would
    mean things behave as before as long as memory is sufficient, with the soft-reference
    clearing kicking in as a safety valve.
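
    The combination being discussed — an LRU-bounded archive whose entries can additionally be cleared by the garbage collector under memory pressure — could be sketched as follows. This is an illustrative sketch, not the code in the pull request; the class and method names are hypothetical.

    ```java
    import java.lang.ref.SoftReference;
    import java.util.LinkedHashMap;
    import java.util.Map;

    // Sketch: an LRU-bounded map whose values sit behind SoftReferences.
    // The LRU cap keeps eviction order predictable while memory is
    // sufficient; the soft references act as a safety valve, letting the
    // GC clear entries before the JVM runs out of memory.
    class SoftLruArchive<K, V> {
        private final Map<K, SoftReference<V>> map;

        SoftLruArchive(final int maxEntries) {
            // accessOrder = true makes the LinkedHashMap iterate in LRU order
            this.map = new LinkedHashMap<K, SoftReference<V>>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<K, SoftReference<V>> eldest) {
                    // Evict the least recently used entry once the cap is exceeded
                    return size() > maxEntries;
                }
            };
        }

        void put(K key, V value) {
            map.put(key, new SoftReference<>(value));
        }

        /** Returns null if the entry was evicted by LRU or cleared by the GC. */
        V get(K key) {
            SoftReference<V> ref = map.get(key);
            return ref == null ? null : ref.get();
        }
    }
    ```

    With this scheme, callers must treat a null result as "graph no longer archived", since an entry can disappear either by LRU eviction or by GC clearing.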


> Archived Execution Graph consumes too much memory
> -------------------------------------------------
>
>                 Key: FLINK-1442
>                 URL: https://issues.apache.org/jira/browse/FLINK-1442
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager
>    Affects Versions: 0.9
>            Reporter: Stephan Ewen
>            Assignee: Max Michels
>
> The JobManager archives the execution graphs for later analysis of jobs. These graphs
> may consume a lot of memory.
> The execution edges in all-to-all connection patterns are especially numerous and drive
> up memory consumption.
> Execution edges connect all parallel tasks, so an all-to-all pattern between n and m
> tasks produces n*m edges. At a parallelism of several hundred tasks, this can easily
> reach 100k objects and more, each with its own set of metadata.
> I propose the following to solve that:
> 1. Clear all execution edges from the graph (the majority of the memory consumers) when
> it is given to the archiver.
> 2. Keep the map/list of archived graphs behind a soft reference, so it will be released
> under memory pressure before the JVM crashes. That may remove graphs from the history
> early, but is much preferable to the JVM crashing, in which case the graphs are lost as
> well.
> 3. Long term: the graphs should be archived somewhere else. Something like the History
> Server used by Hadoop and Hive would be a good idea.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
