flink-issues mailing list archives

From "Maximilian Michels (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-2773) OutOfMemoryError on YARN Session
Date Mon, 28 Sep 2015 20:59:04 GMT

    [ https://issues.apache.org/jira/browse/FLINK-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933970#comment-14933970 ]

Maximilian Michels commented on FLINK-2773:
-------------------------------------------

The maximum direct memory size appears to be set too low. Do you have any logs available from
this run?
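For context: the limit in question is the JVM's {{-XX:MaxDirectMemorySize}} ceiling, which {{java.nio.Bits.reserveMemory}} enforces before failing with "Direct buffer memory" (as in the trace quoted below). A minimal sketch of raising it for the TaskManager JVMs, assuming the {{env.java.opts}} option is available in this version's {{flink-conf.yaml}} and with a purely illustrative value:

{code}
# flink-conf.yaml -- illustrative sketch only; the appropriate size depends on
# the deployment, and option availability depends on the Flink version.
# Raises the JVM cap on direct (off-heap) buffer allocations that
# java.nio.Bits.reserveMemory checks before throwing
# "OutOfMemoryError: Direct buffer memory".
env.java.opts: -XX:MaxDirectMemorySize=2g
{code}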

> OutOfMemoryError on YARN Session
> --------------------------------
>
>                 Key: FLINK-2773
>                 URL: https://issues.apache.org/jira/browse/FLINK-2773
>             Project: Flink
>          Issue Type: Bug
>          Components: YARN Client
>    Affects Versions: 0.10
>            Reporter: Fabian Hueske
>            Priority: Critical
>             Fix For: 0.10
>
>
> When running a Flink program on a detached YARN session using the latest master (commit {{0b3ca57b41e09937b9e63f2f443834c8ad1cf497}}), I observed this {{OutOfMemoryError}}:
> {code}
> java.lang.Exception: The data preparation for task 'CoGroup (coGroup-A68B765B7BAB4E29BF6816965A994776)' , caused an error: Error obtaining the sorted input: Thread 'SortMerger Reading Thread' terminated due to an exception: java.lang.OutOfMemoryError: Direct buffer memory
> 	at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:464)
> 	at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:354)
> 	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:579)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Error obtaining the sorted input: Thread 'SortMerger Reading Thread' terminated due to an exception: java.lang.OutOfMemoryError: Direct buffer memory
> 	at org.apache.flink.runtime.operators.sort.UnilateralSortMerger.getIterator(UnilateralSortMerger.java:607)
> 	at org.apache.flink.runtime.operators.RegularPactTask.getInput(RegularPactTask.java:1089)
> 	at org.apache.flink.runtime.operators.CoGroupDriver.prepare(CoGroupDriver.java:97)
> 	at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:459)
> 	... 3 more
> Caused by: java.io.IOException: Thread 'SortMerger Reading Thread' terminated due to an exception: java.lang.OutOfMemoryError: Direct buffer memory
> 	at org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:787)
> Caused by: org.apache.flink.runtime.io.network.netty.exception.LocalTransportException: java.lang.OutOfMemoryError: Direct buffer memory
> 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientHandler.exceptionCaught(PartitionRequestClientHandler.java:153)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:246)
> 	at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:224)
> 	at io.netty.channel.ChannelInboundHandlerAdapter.exceptionCaught(ChannelInboundHandlerAdapter.java:131)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:246)
> 	at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:224)
> 	at io.netty.channel.ChannelInboundHandlerAdapter.exceptionCaught(ChannelInboundHandlerAdapter.java:131)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:246)
> 	at io.netty.channel.AbstractChannelHandlerContext.notifyHandlerException(AbstractChannelHandlerContext.java:737)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:310)
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
> 	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
> 	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
> 	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
> 	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
> 	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
> 	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
> 	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: io.netty.handler.codec.DecoderException: java.lang.OutOfMemoryError: Direct buffer memory
> 	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:234)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
> 	... 9 more
> Caused by: java.lang.OutOfMemoryError: Direct buffer memory
> 	at java.nio.Bits.reserveMemory(Bits.java:658)
> 	at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
> 	at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
> 	at io.netty.buffer.UnpooledUnsafeDirectByteBuf.allocateDirect(UnpooledUnsafeDirectByteBuf.java:108)
> 	at io.netty.buffer.UnpooledUnsafeDirectByteBuf.capacity(UnpooledUnsafeDirectByteBuf.java:157)
> 	at io.netty.buffer.AbstractByteBuf.ensureWritable(AbstractByteBuf.java:251)
> 	at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:849)
> 	at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:841)
> 	at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:831)
> 	at io.netty.handler.codec.ByteToMessageDecoder$1.cumulate(ByteToMessageDecoder.java:92)
> 	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:228)
> 	... 10 more
> {code}
> Since I know that this feature was working properly until recently, I reverted to commit {{8ca853e0f6c18be8e6b066c6ec0f23badb797323}} and the problem was gone.
> The problem might have been introduced when adding off-heap memory support for YARN (commit {{93c95b6a6f150a2c55dc387e4ef1d603b3ef3f22}}).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
