drill-issues mailing list archives

From "Parth Chandra (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-1480) severe memory leak query snappy compressed parquet file
Date Thu, 06 Nov 2014 23:19:35 GMT

    [ https://issues.apache.org/jira/browse/DRILL-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14201180#comment-14201180 ]

Parth Chandra commented on DRILL-1480:
--------------------------------------

This problem is not limited to Snappy or compressed files. Direct memory in use by Drill grows
over time until the drillbit runs out of memory.
The growth is a consequence of Drill's use of Netty. Netty provides a memory allocator that
divides allocated memory into thread arenas, where each thread gets memory from a specific
arena, so that two independent threads can allocate memory from the pool without
synchronization overhead. In addition, Netty 4.0.20 uses a per-thread memory cache to further
reduce synchronization overhead. When a thread allocates memory, the allocator first looks in
that thread's cache for memory belonging to the thread's arena; if found, it is returned,
otherwise the allocator gets memory from the arena itself. When memory is released, it is
added to the releasing thread's cache.
This works fine when the same thread allocates and releases memory. In Drill, however, memory
allocated by one thread may be passed on to another thread, and the last thread to use the
memory eventually releases it. The effect is to move memory belonging to the allocating
thread's arena into the releasing thread's cache. Since threads are reused across queries,
the allocating threads are constantly 'losing' memory to the releasing threads and therefore
keep allocating more.
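The migration described above can be sketched in plain Java. This is a hypothetical simulation (none of these classes are Netty's; `arena`, `threadCache`, and `freshAllocations` are illustrative names) of the caching policy: freed buffers go to the *releasing* thread's cache, so an allocator thread whose buffers are always freed elsewhere never gets them back and must keep allocating fresh memory.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch of the per-thread cache behaviour described above.
public class ArenaCacheLeakSketch {
    // Shared "arena" holding reusable buffers.
    static final Deque<byte[]> arena = new ArrayDeque<>();
    // Per-thread cache of released buffers (analogous role to Netty 4.0.20's
    // thread pool memory cache).
    static final ThreadLocal<Deque<byte[]>> threadCache =
            ThreadLocal.withInitial(ArrayDeque::new);
    static int freshAllocations = 0;

    static byte[] allocate(int size) {
        Deque<byte[]> cache = threadCache.get();
        byte[] buf = cache.poll();           // 1. try this thread's cache
        if (buf == null) buf = arena.poll(); // 2. then the shared arena
        if (buf == null) {                   // 3. else allocate fresh memory
            freshAllocations++;
            buf = new byte[size];
        }
        return buf;
    }

    static void release(byte[] buf) {
        // Freed memory lands in the *releasing* thread's cache,
        // regardless of which thread allocated it.
        threadCache.get().add(buf);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService releaser = Executors.newSingleThreadExecutor();
        // Allocate on the main thread, release on another thread, repeatedly
        // (as happens in Drill when a downstream thread frees a batch).
        for (int i = 0; i < 5; i++) {
            byte[] buf = allocate(1024);
            releaser.submit(() -> release(buf)).get();
        }
        releaser.shutdown();
        // Every iteration needed fresh memory: each buffer was "lost" to the
        // releaser's cache and never returned to the allocating thread.
        System.out.println("fresh allocations: " + freshAllocations);
    }
}
```

With same-thread release, the second iteration onward would reuse the cached buffer and `freshAllocations` would stay at 1; with cross-thread release every iteration allocates anew, which is the unbounded-growth pattern seen in the drillbit.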
Netty 4.0.24 fixes this. See https://github.com/netty/netty/pull/2855

The fix is to upgrade to Netty 4.0.24.

Running out of heap memory (as indicated in one of the log entries above) is not related to
this problem.

A patch for this is here:
https://github.com/parthchandra/incubator-drill/commit/aacc63320d4e75e6c6ef98751cd8e793935f2b85

It also includes a fix for a minor problem in the Parquet reader discovered during debugging.


> severe memory leak query snappy compressed parquet file
> -------------------------------------------------------
>
>                 Key: DRILL-1480
>                 URL: https://issues.apache.org/jira/browse/DRILL-1480
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Data Types
>    Affects Versions: 0.6.0
>            Reporter: Chun Chang
>            Assignee: Parth Chandra
>            Priority: Blocker
>             Fix For: 0.7.0
>
>
> #Wed Oct 01 00:19:24 EDT 2014
> git.commit.id.abbrev=5c220e3
> Running TPCH query #03, the drillbit shows a severe memory leak and quickly runs out of memory:
> 2014-10-02 00:51:21,520 [WorkManager-116] ERROR o.apache.drill.exec.work.WorkManager - Failure while running wrapper [FragmentExecutor: 7d345235-0eb4-4189-b34f-f535fa5ad1bb:4:10]
> java.lang.OutOfMemoryError: Direct buffer memory
>         at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.7.0_65]
>         at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) ~[na:1.7.0_65]
>         at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) ~[na:1.7.0_65]
>         at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:434) ~[netty-buffer-4.0.20.Final.jar:4.0.20.Final]
>         at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:179) ~[netty-buffer-4.0.20.Final.jar:4.0.20.Final]
>         at io.netty.buffer.PoolArena.allocate(PoolArena.java:168) ~[netty-buffer-4.0.20.Final.jar:4.0.20.Final]
>         at io.netty.buffer.PoolArena.allocate(PoolArena.java:98) ~[netty-buffer-4.0.20.Final.jar:4.0.20.Final]
>         at io.netty.buffer.PooledByteBufAllocatorL.newDirectBuffer(PooledByteBufAllocatorL.java:46) ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:4.0.20.Final]
>         at io.netty.buffer.PooledByteBufAllocatorL.directBuffer(PooledByteBufAllocatorL.java:66) ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:4.0.20.Final]
>         at org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.buffer(TopLevelAllocator.java:205) ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
>         at org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.buffer(TopLevelAllocator.java:212) ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
>         at org.apache.drill.exec.vector.IntVector.allocateNew(IntVector.java:149) ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
>         at org.apache.drill.exec.test.generated.HashTableGen478.allocMetadataVector(HashTableTemplate.java:728) ~[na:na]
>         at org.apache.drill.exec.test.generated.HashTableGen478.access$200(HashTableTemplate.java:41) ~[na:na]
>         at org.apache.drill.exec.test.generated.HashTableGen478$BatchHolder.<init>(HashTableTemplate.java:132) ~[na:na]
>         at org.apache.drill.exec.test.generated.HashTableGen478$BatchHolder.<init>(HashTableTemplate.java:101) ~[na:na]
>         at org.apache.drill.exec.test.generated.HashTableGen478.addBatchHolder(HashTableTemplate.java:654) ~[na:na]
>         at org.apache.drill.exec.test.generated.HashTableGen478.put(HashTableTemplate.java:494) ~[na:na]
>         at org.apache.drill.exec.physical.impl.join.HashJoinBatch.executeBuildPhase(HashJoinBatch.java:344) ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.join.HashJoinBatch.innerNext(HashJoinBatch.java:193) ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
> The query runs fine against an uncompressed parquet file of the same 100G scale factor. Here is the query:
> [root@atsqa8c21 testcases]# cat 03.q
> -- tpch3 using 1395599672 as a seed to the RNG
> select
>   l.l_orderkey,
>   sum(l.l_extendedprice * (1 - l.l_discount)) as revenue,
>   o.o_orderdate,
>   o.o_shippriority
> from
>   customer c,
>   orders o,
>   lineitem l
> where
>   c.c_mktsegment = 'HOUSEHOLD'
>   and c.c_custkey = o.o_custkey
>   and l.l_orderkey = o.o_orderkey
>   and o.o_orderdate < date '1995-03-25'
>   and l.l_shipdate > date '1995-03-25'
> group by
>   l.l_orderkey,
>   o.o_orderdate,
>   o.o_shippriority
> order by
>   revenue desc,
>   o.o_orderdate
> limit 10;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
