flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ufuk Celebi <u.cel...@fu-berlin.de>
Subject Task manager memory configuration with intermediate results
Date Mon, 02 Feb 2015 17:45:31 GMT
Currently, the memory configuration of a task manager encompasses two things:

1) NETWORK buffers: Fixed amount of memory for the network buffer pool (default: 2048 buffers,
each 32 KB => 64 MB)

2) OPERATOR buffers: A configurable fraction of the available heap memory (default 0.7) for
the memory manager used by internal operators like sort or hash

With the recently added supported for intermediate results, intermediate results live in the
network stack and use buffers from the network buffer pool. Currently, our memory management
is not a problem, because we only support ephemeral intermediate results, which are directly
consumed in a pipelined fashion (i.e. the buffer pools are short-lived).

But with the upcoming support for persistent intermediate results (https://github.com/apache/flink/pull/356)
and fine-grained fault tolerance, we need to rethink how we configure/divide the available
memory between the network stack and the operators as the network buffer pools will live longer.

I would suggest the following:

1) Remove the configuration for network buffers

2) Keep the fraction configuration, but internally divide it between network stack and operators
in a 50:50 fashion (in the future with dynamic memory management, we will not even need to
statically divide the memory between network stack and operators).

What do you think? Is this reasonable? Should the division be configurable as well?

– Ufuk
View raw message