flink-dev mailing list archives

From CPC <acha...@gmail.com>
Subject Re: offheap memory allocation and memory leak bug
Date Sun, 19 Jun 2016 16:16:45 GMT
Hi,

I think I found some information regarding this behavior. In the JVM it is
almost impossible to explicitly free memory allocated via
ByteBuffer.allocateDirect. There is no supported way to tell the JVM "free
this direct ByteBuffer". Some forums say you can free the memory with the
method below:

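> import java.nio.ByteBuffer
>
> // Frees the native memory behind each direct buffer by reflectively
> // calling the JVM-internal cleaner().clean(). Assumes direct buffers only.
> def releaseBuffers(buffers: List[ByteBuffer]): List[ByteBuffer] = {
>   if (buffers.nonEmpty) {
>     // cleaner() is declared on the internal direct buffer class
>     val cleanerMethod = buffers.head.getClass.getMethod("cleaner")
>     cleanerMethod.setAccessible(true)
>     buffers.foreach { buffer =>
>       val cleaner = cleanerMethod.invoke(buffer)
>       val cleanMethod = cleaner.getClass.getMethod("clean")
>       cleanMethod.setAccessible(true)
>       cleanMethod.invoke(cleaner)
>     }
>   }
>   // The released buffers must not be touched again
>   List.empty[ByteBuffer]
> }
>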
But since cleaner is an internal method, the above is not recommended: it
does not work on every JVM, and Java 9 does not support it either. I also
made some tests with the above method and the behavior is not predictable. If
the memory was allocated by some other thread and that thread exits, then the
memory is released. In practice the GC controls direct memory buffers. If
there is no GC activity while memory is allocated and then dereferenced by
different threads, memory usage goes beyond the intended limit, the machine
starts swapping, and then the OS kills the taskmanager. In my tests I saw
this behaviour:

Suppose thread A allocates 8 GB of memory and exits, leaving no reference
to the allocated memory,
then thread B allocates 8 GB of memory and exits, leaving no reference to
the allocated memory.
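
The scenario above can be sketched roughly as follows. This is a hypothetical
reproduction, not the original test code: the object and method names are
made up for illustration, and the sizes are scaled down to 64 MB per thread
so it runs with default JVM settings. It reads the "direct" buffer pool
through the standard BufferPoolMXBean, which should be the same figure
jvisualvm plots.

```scala
import java.lang.management.{BufferPoolMXBean, ManagementFactory}
import java.nio.ByteBuffer

object DirectMemoryScenario {
  // Bytes currently accounted to the JVM's "direct" buffer pool.
  def directPoolUsed(): Long = {
    val pools = ManagementFactory.getPlatformMXBeans(classOf[BufferPoolMXBean]).iterator()
    var used = 0L
    while (pools.hasNext) {
      val pool = pools.next()
      if (pool.getName == "direct") used = pool.getMemoryUsed
    }
    used
  }

  // Allocate `chunks` direct buffers of `chunkBytes` each on a fresh thread,
  // then let the thread exit so that no reference to the buffers remains.
  def allocateOnThread(name: String, chunks: Int, chunkBytes: Int): Unit = {
    val worker = new Thread(new Runnable {
      def run(): Unit = {
        val buffers = List.fill(chunks)(ByteBuffer.allocateDirect(chunkBytes))
        buffers.foreach(_.put(0, 1.toByte)) // touch each buffer once
      }
    }, name)
    worker.start()
    worker.join()
  }

  def main(args: Array[String]): Unit = {
    val mb = 1024 * 1024
    println(s"start:          ${directPoolUsed() / mb} MB")
    allocateOnThread("A", 8, 8 * mb)
    println(s"after thread A: ${directPoolUsed() / mb} MB")
    allocateOnThread("B", 8, 8 * mb)
    println(s"after thread B: ${directPoolUsed() / mb} MB")
    System.gc() // only a hint; the JVM may or may not collect here
    Thread.sleep(500)
    println(s"after GC:       ${directPoolUsed() / mb} MB")
  }
}
```

The printed numbers depend on the JVM and GC settings, so no particular
output is implied here.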

When I look at direct memory usage in jvisualvm it looks like the graph
below (-Xmx512m -XX:MaxDirectMemorySize=12G):

[image: Inline images 1]

But the RSS of the process is 16 GB. If I call System.gc at that point, RSS
drops to 8 GB, but not to the expected level.
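
To compare RSS with the direct pool figure from inside the process, one
option is to read the VmRSS line of /proc/self/status. The sketch below
assumes Linux; RssProbe and its methods are hypothetical names used only for
illustration. Keep in mind that RSS also covers the heap, thread stacks and
other native allocations, so it is not expected to match the direct pool
figure exactly.

```scala
import scala.io.Source

object RssProbe {
  // Extract the VmRSS value (in kB) from the text of /proc/<pid>/status.
  def parseVmRssKb(status: String): Option[Long] =
    status.split("\n").iterator
      .map(_.trim)
      .find(_.startsWith("VmRSS:"))
      .map(_.stripPrefix("VmRSS:").trim.split("\\s+")(0).toLong)

  // RSS of the current JVM in kB, or None where /proc is not available.
  def currentRssKb(): Option[Long] = {
    val statusFile = new java.io.File("/proc/self/status")
    if (!statusFile.canRead) None
    else {
      val source = Source.fromFile(statusFile)
      try parseVmRssKb(source.mkString) finally source.close()
    }
  }
}
```

Calling RssProbe.currentRssKb() before and after System.gc would give the
same kind of comparison as the ps output.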

This is why the Apache Cassandra developers chose sun.misc.Unsafe (see
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Off-heap-caching-through-ByteBuffer-allocateDirect-when-JNA-not-available-td6977711.html
).
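
For illustration, explicit allocation and release through sun.misc.Unsafe
looks roughly like the sketch below. This is not Cassandra's code;
UnsafeAllocSketch and roundTrip are hypothetical names. sun.misc.Unsafe is an
unsupported internal API, and newer JDKs deprecate its memory-access methods,
so treat this only as a sketch of the idea.

```scala
object UnsafeAllocSketch {
  // Obtain the singleton through reflection; there is no public accessor
  // that application code can call directly.
  private val unsafe: sun.misc.Unsafe = {
    val field = classOf[sun.misc.Unsafe].getDeclaredField("theUnsafe")
    field.setAccessible(true)
    field.get(null).asInstanceOf[sun.misc.Unsafe]
  }

  // Allocate 8 bytes off heap, write a value, read it back, and free the
  // memory explicitly instead of waiting for the GC.
  def roundTrip(value: Long): Long = {
    val address = unsafe.allocateMemory(8L)
    try {
      unsafe.putLong(address, value)
      unsafe.getLong(address)
    } finally {
      unsafe.freeMemory(address) // released immediately, no GC involved
    }
  }
}
```

The point is that freeMemory returns the memory at a known time, which
ByteBuffer.allocateDirect does not offer.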

I think that currently the only way to limit memory usage in Flink, if you
want to reuse the same taskmanager across jobs, is
"taskmanager.memory.preallocate: true". Since it allocates the memory at
startup and never frees it, memory usage stays constant.
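
As a config sketch, the relevant flink-conf.yaml entries would look something
like this. The heap size and fraction are simply the values from my earlier
test and are only illustrative:

```yaml
# Managed memory is allocated off heap
taskmanager.memory.off-heap: true
# Allocate all managed memory at taskmanager startup and keep it
taskmanager.memory.preallocate: true
taskmanager.memory.fraction: 0.9
taskmanager.heap.mb: 6500
```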

PS: Sorry for my English, I am not a native speaker. I hope I explained
what I intended to :)



On 18 June 2016 at 16:36, CPC <achalil@gmail.com> wrote:

> Hello,
>
> I repeated the same test with these conf values:
>
>> taskmanager.heap.mb: 6500
>>
>> taskmanager.memory.off-heap: true
>>
>> taskmanager.memory.fraction: 0.9
>>
>>
> I set TM_MAX_OFFHEAP_SIZE="6G" in taskmanager.sh. The taskmanager started with:
>
>> capacman 14543  323 56.0 17014744 13731328 pts/1 Sl 16:23  35:25
>> /home/capacman/programlama/java/jdk1.7.0_75/bin/java
>> -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -Xms650M -Xmx650M
>> -XX:MaxDirectMemorySize=6G -XX:MaxPermSize=256m
>> -Dlog.file=/home/capacman/Data/programlama/flink-1.0.3/log/flink-capacman-taskmanager-0-capacman-Aspire-V3-771.log
>> -Dlog4j.configuration=file:/home/capacman/Data/programlama/flink-1.0.3/conf/log4j.properties
>> -Dlogback.configurationFile=file:/home/capacman/Data/programlama/flink-1.0.3/conf/logback.xml
>> -classpath
>> /home/capacman/Data/programlama/flink-1.0.3/lib/flink-dist_2.11-1.0.3.jar:/home/capacman/Data/programlama/flink-1.0.3/lib/flink-python_2.11-1.0.3.jar:/home/capacman/Data/programlama/flink-1.0.3/lib/log4j-1.2.17.jar:/home/capacman/Data/programlama/flink-1.0.3/lib/slf4j-log4j12-1.7.7.jar:::
>> org.apache.flink.runtime.taskmanager.TaskManager --configDir
>> /home/capacman/Data/programlama/flink-1.0.3/conf
>>
>
> But memory usage reaches up to 13 GB. Could somebody explain to me why
> memory usage is so high? I expected it to be at most 8 GB with some JVM
> internal overhead.
>
> [image: Inline images 1]
>
> [image: Inline images 2]
>
> On 17 June 2016 at 20:26, CPC <achalil@gmail.com> wrote:
>
>> Hi,
>>
>> I am running some tests on off-heap memory usage and encountered an odd
>> behavior. My taskmanager heap limit is 12288 MB, and when I set
>> "taskmanager.memory.off-heap: true", every job allocates at most 11673 MB
>> of off-heap memory, which is heapsize*0.95 (the value of
>> taskmanager.memory.fraction). But when I submit a second job, it allocates
>> another 11 GB and does not free the memory, since MaxDirectMemorySize is
>> set to "-XX:MaxDirectMemorySize=${TM_MAX_OFFHEAP_SIZE}" with
>> TM_MAX_OFFHEAP_SIZE="8388607T". My laptop starts swapping and then the
>> kernel OOM killer kills the taskmanager. If I hit "Perform GC" in visualvm
>> between jobs, it releases the direct memory, but the taskmanager's memory
>> usage in ps is still around 20 GB (RSS) and 27 GB (virtual size). In that
>> case I could submit my test job a few times without the taskmanager being
>> OOM-killed, but after 10 submissions it was killed again. I don't
>> understand why JVM memory usage is still high even when all direct memory
>> has been released. Do you have any idea?
>>
>> Then I set MaxDirectMemorySize to 12 GB. In this case it freed direct
>> memory without any explicit GC trigger from visualvm, but the JVM process
>> memory usage was still high, around 20 GB (RSS) and 27 GB (virtual size).
>> After maybe 10 more submissions the taskmanager was killed again. I think
>> this is a bug, and it makes it impossible to reuse taskmanagers without
>> restarting them in standalone mode.
>>
>> [image: Inline images 1]
>>
>> [image: Inline images 2]
>>
>
>
