drill-issues mailing list archives

From "Parth Chandra (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-1504) Enabling fragment memory limit causes out of memory error
Date Tue, 07 Oct 2014 22:36:33 GMT
Parth Chandra created DRILL-1504:

             Summary: Enabling fragment memory limit causes out of memory error
                 Key: DRILL-1504
                 URL: https://issues.apache.org/jira/browse/DRILL-1504
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Flow
    Affects Versions: 0.6.0

            Reporter: Parth Chandra
             Fix For: 0.7.0

When the fragment memory limit is enabled, a query with a large number of fragments hits
the fragment memory limit after it has been run a few times.
There appear to be two problems: 1) at the end of the query, the drillbit does not reset
the fragment limit to the value it had before the query was run, and 2) the fragment limit
seems to be smaller than expected.

The cause seems to be the following:
When a drillbit receives a request for a fragmentRecordBatch, each of the BitServer threads
creates a NonRootFragmentManager object corresponding to the FragmentHandle. Only one of the
NonRootFragmentManager objects is actually used; the others are discarded and garbage collected.

However, when the fragment memory limit is enabled, the Allocator corresponding to each of these
NonRootFragmentManager objects registers the corresponding FragmentContext with the top-level
allocator, which then uses this information to recalculate the fragment limit.
This has two effects: 1) The top-level allocator counts more fragments than actually exist,
because it counts each fragment multiple times. 2) The top-level allocator keeps a reference
to the fragment context, which prevents the object from being garbage collected. Worse, since
no code actually 'closes' the fragment context, these objects remain registered with the
top-level allocator across queries, eventually causing an out of memory condition.
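
To illustrate the registration behavior described above, here is a minimal, self-contained
sketch. It is not Drill's code: the class names (FragmentLimitLeakDemo, TopLevelAllocatorSketch,
FragmentContextSketch), the even-split limit policy, and the numbers are all assumptions
chosen only to show how duplicate, never-removed registrations shrink the per-fragment limit
and accumulate across queries.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical demo; none of these types are Drill's actual classes. */
final class FragmentLimitLeakDemo {
    /**
     * Simulates one query: each fragment's context is registered once per
     * duplicate manager, and nothing is ever unregistered afterwards.
     */
    static void runQuery(TopLevelAllocatorSketch top, int fragments, int duplicates) {
        for (int f = 0; f < fragments; f++) {
            for (int d = 0; d < duplicates; d++) {
                top.register(new FragmentContextSketch("fragment-" + f));
            }
        }
    }

    public static void main(String[] args) {
        TopLevelAllocatorSketch top = new TopLevelAllocatorSketch(1_000_000L);
        for (int query = 1; query <= 3; query++) {
            runQuery(top, 10, 4); // 10 real fragments, each registered 4 times
            System.out.println("after query " + query
                + ": registered=" + top.registeredCount()
                + " limit=" + top.fragmentLimit());
        }
    }
}

/** Hypothetical stand-in for a fragment context. */
final class FragmentContextSketch {
    final String fragmentHandle;

    FragmentContextSketch(String fragmentHandle) {
        this.fragmentHandle = fragmentHandle;
    }
}

/** Hypothetical stand-in for the top-level allocator. */
final class TopLevelAllocatorSketch {
    private final long totalMemory;
    // Strong references: a registered context cannot be garbage collected
    // until it is explicitly removed from this list.
    private final List<FragmentContextSketch> registered = new ArrayList<>();

    TopLevelAllocatorSketch(long totalMemory) {
        this.totalMemory = totalMemory;
    }

    void register(FragmentContextSketch ctx) {
        registered.add(ctx);
    }

    /** What a 'close' of the fragment context would need to trigger. */
    void unregister(FragmentContextSketch ctx) {
        registered.remove(ctx);
    }

    int registeredCount() {
        return registered.size();
    }

    /** Assumed policy: total memory split evenly across registered contexts. */
    long fragmentLimit() {
        return registered.isEmpty() ? totalMemory : totalMemory / registered.size();
    }
}
```

Under these assumptions the limit drops from 25000 to 12500 to 8333 over three runs of the
same 10-fragment query, even though an accurate count with cleanup at the end of each query
would keep it at 100000 throughout. Calling unregister when a duplicate manager is discarded
or when a fragment finishes would address both effects in this sketch.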
