drill-issues mailing list archives

From "Steven Phillips (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-2847) DrillBufs from the RPC layer are being leaked
Date Thu, 23 Apr 2015 00:36:39 GMT

    [ https://issues.apache.org/jira/browse/DRILL-2847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508222#comment-14508222 ]

Steven Phillips commented on DRILL-2847:
----------------------------------------

My thoughts:

There are basically two different stack traces in Chris' log:

{code}
io.netty.buffer.UnsafeDirectLittleEndian.<init>:91
io.netty.buffer.PooledByteBufAllocatorL.newDirectBuffer:52
io.netty.buffer.PooledByteBufAllocatorL.directBuffer:66
io.netty.buffer.PooledByteBufAllocatorL.directBuffer:1
io.netty.buffer.AbstractByteBufAllocator.directBuffer:141
io.netty.buffer.AbstractByteBufAllocator.buffer:75
org.apache.drill.exec.rpc.RpcEncoder.encode:87
org.apache.drill.exec.rpc.RpcEncoder.encode:1
io.netty.handler.codec.MessageToMessageEncoder.write:89
io.netty.channel.AbstractChannelHandlerContext.invokeWrite:658
io.netty.channel.AbstractChannelHandlerContext.access$2000:32
io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write:939
io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write:991
io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run:924
io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks:380
io.netty.channel.nio.NioEventLoop.run:357
io.netty.util.concurrent.SingleThreadEventExecutor$2.run:116
{code}

I believe this is a buffer allocated by Netty for socket I/O; the trace goes through
RpcEncoder.encode and WriteAndFlushTask, so it is on the write path rather than the read path.
I think it is expected that this would be here, because Netty will reuse these buffers. However,
this buffer is not allocated through TopLevelAllocator, so Drill is unable to account for it,
which I think is a bit problematic. If the size and number of these buffers are small, that is
probably acceptable, but we should investigate and confirm that their number is small and bounded.
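
One way to start that investigation is to bucket the outstanding-buffer traces by whether a
TopLevelAllocator frame appears in them, and then compare the counts across snapshots. A minimal
sketch, assuming the traces are available as lists of frame strings like the ones above;
AllocationSiteSummary, classify and summarize are hypothetical names, not Drill code:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: buckets outstanding-buffer traces by whether Drill's allocator appears in them. */
final class AllocationSiteSummary {

    // Frame prefix as it appears in the traces in drill-mem.log; a trace containing it
    // was allocated through Drill's allocator and is therefore visible to its accounting.
    static final String DRILL_ALLOCATOR_PREFIX = "org.apache.drill.exec.memory.TopLevelAllocator.";

    /** Returns "accounted" if any frame goes through TopLevelAllocator, otherwise "unaccounted". */
    static String classify(List<String> frames) {
        for (String frame : frames) {
            if (frame.startsWith(DRILL_ALLOCATOR_PREFIX)) {
                return "accounted";
            }
        }
        return "unaccounted";
    }

    /** Counts traces per category; comparing snapshots over time shows whether a category keeps growing. */
    static Map<String, Integer> summarize(List<List<String>> traces) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> trace : traces) {
            counts.merge(classify(trace), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Two abbreviated traces, using frames copied from the log excerpts.
        List<String> encoderTrace = Arrays.asList(
            "io.netty.buffer.AbstractByteBufAllocator.buffer:75",
            "org.apache.drill.exec.rpc.RpcEncoder.encode:87");
        List<String> decoderTrace = Arrays.asList(
            "org.apache.drill.exec.memory.TopLevelAllocator.buffer:94",
            "org.apache.drill.exec.rpc.ProtobufLengthDecoder.decode:83");
        System.out.println(summarize(Arrays.asList(encoderTrace, decoderTrace, decoderTrace)));
    }
}
```

If the unaccounted count stays flat from one test case to the next, that would suggest the
Netty-side buffers are bounded.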

The other one:

{code}
io.netty.buffer.UnsafeDirectLittleEndian.<init>:91
io.netty.buffer.PooledByteBufAllocatorL.newDirectBuffer:52
io.netty.buffer.PooledByteBufAllocatorL.directBuffer:66
org.apache.drill.exec.memory.TopLevelAllocator.buffer:94
org.apache.drill.exec.memory.TopLevelAllocator.buffer:102
org.apache.drill.exec.rpc.ProtobufLengthDecoder.decode:83
org.apache.drill.exec.rpc.data.DataProtobufLengthDecoder$Server.decode:52
io.netty.handler.codec.ByteToMessageDecoder.callDecode:247
io.netty.handler.codec.ByteToMessageDecoder.channelRead:147
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead:333
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead:319
io.netty.channel.ChannelInboundHandlerAdapter.channelRead:86
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead:333
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead:319
io.netty.channel.DefaultChannelPipeline.fireChannelRead:787
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read:130
io.netty.channel.nio.NioEventLoop.processSelectedKey:511
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized:468
io.netty.channel.nio.NioEventLoop.processSelectedKeys:382
io.netty.channel.nio.NioEventLoop.run:354
io.netty.util.concurrent.SingleThreadEventExecutor$2.run:116
{code}

Perhaps my thinking is wrong, but I think we should not be seeing these left over. This is
not the socket read buffer, but rather the buffer that the RPC layer allocates through
TopLevelAllocator (in ProtobufLengthDecoder.decode) and copies data into after reading it off
the wire.

Please correct me if my understanding is wrong.

> DrillBufs from the RPC layer are being leaked
> ---------------------------------------------
>
>                 Key: DRILL-2847
>                 URL: https://issues.apache.org/jira/browse/DRILL-2847
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow, Execution - RPC
>    Affects Versions: 0.9.0
>            Reporter: Chris Westin
>            Assignee: Jacques Nadeau
>             Fix For: 1.0.0
>
>         Attachments: DRILL-2847-bug.2.patch.txt, DRILL-2847-bug.patch.txt, drill-mem.log
>
>
> I've created a patch that demonstrates this. In the patch, code is added to UnsafeDirectLittleEndian
to track all the instances of that class that are created (which happens when buffers are
allocated inside TopLevelAllocator). release() is overridden to remove these from the tracked
list when they are released. An @After action is added to TestTpchDistributed which checks
on the count of outstanding buffers. If the test is run, every case fails, and each failure
shows a progressively larger and larger number of outstanding buffers. There are no complaints
from the allocator.
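
For readers without the patch at hand, the tracking idea can be sketched in plain Java. This is an
illustration under assumptions, not the code in DRILL-2847-bug.patch.txt: BufferLeakSketch,
TrackedBuffer and assertNoOutstandingBuffers are hypothetical names, and TrackedBuffer merely
stands in for UnsafeDirectLittleEndian.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the tracking described above; not the code from the attached patch. */
final class BufferLeakSketch {

    /** Live buffers mapped to a Throwable captured at construction, so a leak can be traced to its call site. */
    private static final Map<TrackedBuffer, Throwable> OUTSTANDING = new ConcurrentHashMap<>();

    /** Minimal reference-counted buffer that registers itself on creation. */
    static final class TrackedBuffer {
        private final AtomicInteger refCnt = new AtomicInteger(1);

        TrackedBuffer() {
            OUTSTANDING.put(this, new Throwable("allocation site"));
        }

        /** Drops one reference; removes the buffer from the registry when the count reaches zero. */
        boolean release() {
            int remaining = refCnt.decrementAndGet();
            if (remaining < 0) {
                throw new IllegalStateException("buffer released too many times");
            }
            if (remaining == 0) {
                OUTSTANDING.remove(this);
                return true;
            }
            return false;
        }
    }

    static int outstandingCount() {
        return OUTSTANDING.size();
    }

    /** What an @After-style check could call: print allocation sites and fail if anything is still live. */
    static void assertNoOutstandingBuffers() {
        if (!OUTSTANDING.isEmpty()) {
            for (Throwable site : OUTSTANDING.values()) {
                site.printStackTrace();
            }
            throw new AssertionError(OUTSTANDING.size() + " buffer(s) were never released");
        }
    }

    public static void main(String[] args) {
        TrackedBuffer released = new TrackedBuffer();
        TrackedBuffer leaked = new TrackedBuffer(); // deliberately never released
        released.release();
        System.out.println("outstanding buffers: " + outstandingCount());
    }
}
```

Capturing a Throwable at construction is what makes it possible to print allocation stack traces
like those in drill-mem.log; it is costly, so a check like this belongs in tests only.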



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
