flink-user mailing list archives

From Piotr Nowojski <pi...@ververica.com>
Subject Re: CoGroup SortMerger performance degradation from 1.6.4 - 1.9.1?
Date Wed, 27 Nov 2019 07:13:17 GMT
Hi,

> @yingjie Do you have any idea how much memory will be stolen from the OS when using mmap
> for data reading?

I think this is bounded only by the size of the written data. Also, it will not be “stolen
from the OS”, as the kernel controls how many pages currently reside in RAM depending
on the current memory pressure (the same way as the file cache).
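
To illustrate, here is a minimal self-contained sketch of that kind of mmap read path (plain
NIO, not Flink's actual implementation; the file name is made up). The mapping is bounded by
the size of the written file, and the kernel decides which of its pages stay resident:

import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MmapReadSketch {
    public static void main(String[] args) throws Exception {
        try (RandomAccessFile file = new RandomAccessFile("/tmp/partition.data", "r");
             FileChannel channel = file.getChannel()) {
            // A single mapping is limited to Integer.MAX_VALUE bytes.
            long size = Math.min(channel.size(), Integer.MAX_VALUE);
            MappedByteBuffer region = channel.map(FileChannel.MapMode.READ_ONLY, 0, size);
            long sum = 0;
            while (region.hasRemaining()) {
                // Touching the buffer faults pages into RAM; the kernel can evict
                // them again under memory pressure, like ordinary file cache pages.
                sum += region.get();
            }
            System.out.println("bytes mapped: " + size + ", checksum: " + sum);
        }
    }
}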

Piotrek
  
> On 27 Nov 2019, at 04:50, bupt_ljy <bupt_ljy@163.com> wrote:
> 
> Hi,
> 
> I’ve run into exactly the same problem recently and solved it in Piotr’s way. @zhijiang,
> I didn’t see any OOM error thrown by the JVM (I’m not sure one can be thrown if YARN decides
> to kill the container forcibly). According to our monitoring system, the memory overuse
> comes from JVM direct memory.
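> 
> For anyone who wants to reproduce this check without an external monitoring system, a small
> sketch (plain JDK MXBeans, nothing Flink-specific) that prints the "direct" and "mapped"
> buffer pools of the running JVM:
> 
> import java.lang.management.BufferPoolMXBean;
> import java.lang.management.ManagementFactory;
> 
> public class BufferPoolsSketch {
>     public static void main(String[] args) {
>         // The JVM exposes two buffer pools: "direct" (ByteBuffer.allocateDirect,
>         // capped by -XX:MaxDirectMemorySize) and "mapped" (MappedByteBuffer regions).
>         for (BufferPoolMXBean pool :
>                 ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
>             System.out.printf("%s: count=%d, used=%d bytes, capacity=%d bytes%n",
>                 pool.getName(), pool.getCount(), pool.getMemoryUsed(), pool.getTotalCapacity());
>         }
>     }
> }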
> 
> The interesting part is that the old way works if I increase -XX:MaxDirectMemorySize
> to around 3 GB (it was around 2 GB before). So I suspect we at least need to reserve one
> #ByteBuffer’s size in #memoryMappedRegions for the #MappedByteBuffer (which is 2 GB for
> large files). Not sure I’m right about this.
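> 
> For reference, a minimal sketch (not the actual #memoryMappedRegions code) of why a single
> #MappedByteBuffer tops out at 2 GB and a larger file has to be split into several mapped
> regions:
> 
> import java.nio.MappedByteBuffer;
> import java.nio.channels.FileChannel;
> import java.nio.file.Paths;
> import java.nio.file.StandardOpenOption;
> import java.util.ArrayList;
> import java.util.List;
> 
> public class MappedRegionsSketch {
>     // FileChannel.map() is limited to Integer.MAX_VALUE bytes per region,
>     // so a file larger than ~2 GB needs more than one MappedByteBuffer.
>     static List<MappedByteBuffer> mapWholeFile(String path) throws Exception {
>         List<MappedByteBuffer> regions = new ArrayList<>();
>         try (FileChannel channel = FileChannel.open(Paths.get(path), StandardOpenOption.READ)) {
>             long fileSize = channel.size();
>             for (long offset = 0; offset < fileSize; offset += Integer.MAX_VALUE) {
>                 long regionSize = Math.min(Integer.MAX_VALUE, fileSize - offset);
>                 regions.add(channel.map(FileChannel.MapMode.READ_ONLY, offset, regionSize));
>             }
>         }
>         return regions;
>     }
> }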
> 
> @yingjie Do you have any idea how much memory will be stolen from the OS when using mmap
> for data reading?
> 
> 
> Best,
> Jiayi Liao
> 
> ------------------ Original Message ------------------
> Sender: yingjie <yjcloveyou@gmail.com>
> Recipient: user <user@flink.apache.org>
> Date: Tuesday, Nov 26, 2019 18:10
> Subject: Re: CoGroup SortMerger performance degradation from 1.6.4 - 1.9.1?
> 
> The new BlockingSubpartition implementation in 1.9 uses mmap for data reading
> by default, which means it effectively steals memory from the OS. The memory of the
> mmapped region is not allocated from JVM-managed memory, so there should be no
> OutOfMemoryError reported by the JVM, and the OS memory is not exhausted either,
> so there should be no kernel OOM. I think Piotr's suspicion is right: YARN tracked
> the memory usage and killed the TM (the mmapped region is also part of the process memory).
> 
> Either giving a strict resource restriction to the container (larger than the YARN
> limit), which can avoid the memory stealing, or using file instead of mmap as pointed
> out by Piotr, can solve the problem.
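> 
> As an illustration of the file-based alternative, here is a minimal sketch (plain NIO, not
> Flink's BoundedBlockingSubpartition code; the file path is made up): reading through a
> fixed-size buffer keeps the off-heap footprint bounded by the buffer size rather than by
> the file size. In 1.9 I believe the read path can be switched away from mmap via
> taskmanager.network.bounded-blocking-subpartition-type, but please double check the option
> name against the 1.9 docs.
> 
> import java.nio.ByteBuffer;
> import java.nio.channels.FileChannel;
> import java.nio.file.Paths;
> import java.nio.file.StandardOpenOption;
> 
> public class FileReadSketch {
>     public static void main(String[] args) throws Exception {
>         try (FileChannel channel = FileChannel.open(Paths.get("/tmp/partition.data"),
>                                                     StandardOpenOption.READ)) {
>             // A fixed-size direct buffer: the off-heap usage is bounded by the buffer
>             // size (and counted against -XX:MaxDirectMemorySize), no matter how large
>             // the partition file is.
>             ByteBuffer buffer = ByteBuffer.allocateDirect(64 * 1024);
>             while (channel.read(buffer) != -1) {
>                 buffer.flip();
>                 // ... hand the bytes to the consumer here ...
>                 buffer.clear();
>             }
>         }
>     }
> }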
> 
> I think Flink may need to restrict the amount of memory that can be stolen.
> 
> 
> 
> --
> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

