flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Vinay (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-7289) Memory allocation of RocksDB can be problematic in container environments
Date Wed, 02 Aug 2017 05:59:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-7289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16110363#comment-16110363
] 

Vinay commented on FLINK-7289:
------------------------------

Hi Stephan,

I am not saying that we should use the same approach of dropping the cache or introduce this
in Flink, that is surely not the correct approach. It's just that I found it easy to clean
the memory whenever I wanted to run the job second time after I canceled or killed it because
the TM was getting killed every time I run the job second time because the memory usage was
full (even though I was expecting YARN to clean the memory when the job is canceled or killed)

I am running the job on EMR with which Flink is already installed and I have not done any
extra configurations as well. May be it is a configuration issue which I am not aware of.
I will surely share the logs whenever I run the pipeline again on EMR.

> Memory allocation of RocksDB can be problematic in container environments
> -------------------------------------------------------------------------
>
>                 Key: FLINK-7289
>                 URL: https://issues.apache.org/jira/browse/FLINK-7289
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.2.0, 1.3.0, 1.4.0
>            Reporter: Stefan Richter
>
> Flink's RocksDB based state backend allocates native memory. The amount of allocated
memory by RocksDB is not under the control of Flink or the JVM and can (theoretically) grow
without limits.
> In container environments, this can be problematic because the process can exceed the
memory budget of the container, and the process will get killed. Currently, there is no other
option than trusting RocksDB to be well behaved and to follow its memory configurations. However,
limiting RocksDB's memory usage is not as easy as setting a single limit parameter. The memory
limit is determined by an interplay of several configuration parameters, which is almost impossible
to get right for users. Even worse, multiple RocksDB instances can run inside the same process
and make reasoning about the configuration also dependent on the Flink job.
> Some information about the memory management in RocksDB can be found here:
> https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB
> https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide
> We should try to figure out ways to help users in one or more of the following ways:
> - Some way to autotune or calculate the RocksDB configuration.
> - Conservative default values.
> - Additional documentation.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message