drill-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5967) Memory leak by HashPartitionSender
Date Sat, 23 Dec 2017 05:52:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16302225#comment-16302225
] 

ASF GitHub Bot commented on DRILL-5967:
---------------------------------------

Github user ilooner commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1073#discussion_r158577033
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java
---
    @@ -341,6 +351,10 @@ public void close() throws Exception {
           updateAggregateStats();
           partitioner.clear();
         }
    +
    +    if (closeIncoming) {
    +      ((CloseableRecordBatch) incoming).close();
    +    }
    --- End diff --
    
    Hi Paul, I understand the FragmentExecutor closes all the operators. However, the FragmentExecutor
is only aware of the operators in the tree that it has a reference too. In the case of the
OrderedPartitionSenderCreator, an upstream record batch is wrapped in an OrderedPartitionRecordBatch.
Since the OrderedPartitionRecordBatch is instantiated in the creator the FragmentExecutor
does not have a reference to it. So when the FragmentExecutor closes the operators it closes
the original operator, but not not the wrapping OrderedPartitionRecordBatch. This is an issue
since the OrderedPartitionRecordBatch allocates VectorContainers which are consequentially
never released. This is clearly the source of the memory leak and the fix was verified by
QA.
    
    There are two possible Fixes. 
    
    A) We change the Creators in some way to communicate to the FragmentExecutor that they
have wrapped an operator, so the FragmentExecutor can close the wrapped operator instead of
the original operator. My impression is that this will evolve into a fairly large change.
    
    B) Or we take a less invasive approach and simply tell the PartitionSenderRootExec whether
to close the wrapped operator (This is what I did).
    
    I agree approach B is not that great and that approach A is cleaner. But taking approach
A is a more intensive project, and I was not keen on having another simple change explode
into another 100 file PR. It would be easier to make that change after we've made more improvements
to Drill unit testing.
    
    Let me know what you think. I am not against taking approach A, but then we'll have to
talk with Pritesh about fitting that project in.


> Memory leak by HashPartitionSender
> ----------------------------------
>
>                 Key: DRILL-5967
>                 URL: https://issues.apache.org/jira/browse/DRILL-5967
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Timothy Farkas
>            Assignee: Timothy Farkas
>
> The error found by [~cchang@maprtech.com] and [~dechanggu]
> {code}
> 2017-10-25 15:43:28,658 [260eec84-7de3-03ec-300f-7fdbc111fb7c:frag:2:9] ERROR o.a.d.e.w.fragment.FragmentExecutor
- SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory leaked: (9216)
> Allocator(op:2:9:0:HashPartitionSender) 1000000/9216/12831744/10000000000 (res/actual/peak/limit)
> Fragment 2:9
> [Error Id: 7eae6c2a-868c-49f8-aad8-b690243ffe9b on mperf113.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IllegalStateException:
Memory was leaked by query. Memory leaked: (9216)
> Allocator(op:2:9:0:HashPartitionSender) 1000000/9216/12831744/10000000000 (res/actual/peak/limit)
> Fragment 2:9
> [Error Id: 7eae6c2a-868c-49f8-aad8-b690243ffe9b on mperf113.qa.lab:31010]
>         at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586)
~[drill-common-1.11.0-mapr.jar:1.11.0-mapr]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:301)
[drill-java-exec-1.11.0-mapr.jar:1.11.0-mapr]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160)
[drill-java-exec-1.11.0-mapr.jar:1.11.0-mapr]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:267)
[drill-java-exec-1.11.0-mapr.jar:1.11.0-mapr]
>         at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
[drill-common-1.11.0-mapr.jar:1.11.0-mapr]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[na:1.8.0_121]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[na:1.8.0_121]
>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
> Caused by: java.lang.IllegalStateException: Memory was leaked by query. Memory leaked:
(9216)
> Allocator(op:2:9:0:HashPartitionSender) 1000000/9216/12831744/10000000000 (res/actual/peak/limit)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message