drill-dev mailing list archives

From "Paul Rogers (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-5273) ScanBatch exhausts memory when reading 5000 small files
Date Fri, 17 Feb 2017 06:16:41 GMT
Paul Rogers created DRILL-5273:
----------------------------------

             Summary: ScanBatch exhausts memory when reading 5000 small files
                 Key: DRILL-5273
                 URL: https://issues.apache.org/jira/browse/DRILL-5273
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.10
            Reporter: Paul Rogers
            Assignee: Paul Rogers
             Fix For: 1.10


A test case was created consisting of 5000 text files, each containing a single record of at most 4 characters: the file number (1 to 5001).

Run the following query:

{code}
SELECT * FROM `dfs.data`.`5000files/text`
{code}

The query fails with an OOM in the scan batch at around record 3700 on a Mac with 4 GB
of direct memory.

The code that reads records in {{ScanBatch}} is complex. The following appears to occur:

* Iterate over the record readers, one per file.
* For each reader, call {{setup()}} (sketched below).
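
A schematic view of that loop (simplified Java; not the exact {{ScanBatch}} source, and the field names are illustrative):

{code}
// Schematic only: how the scan advances from one record reader to the next.
// Each setup() call below allocates the buffers shown further down, and
// nothing frees them until the whole scan closes.
public IterOutcome next() {
  while (true) {
    int recordCount = currentReader.next();   // read from the current file
    if (recordCount > 0) {
      return IterOutcome.OK;
    }
    if (!readers.hasNext()) {
      return IterOutcome.NONE;                // all files consumed
    }
    currentReader = readers.next();           // advance to the next file...
    currentReader.setup(oContext, mutator);   // ...allocating another set of buffers
  }
}
{code}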

The setup code is:
{code}
  public void setup(OperatorContext context, OutputMutator outputMutator) throws ExecutionSetupException {
    oContext = context;
    readBuffer = context.getManagedBuffer(READ_BUFFER);
    whitespaceBuffer = context.getManagedBuffer(WHITE_SPACE_BUFFER);
{code}

The two buffers are in direct memory. There is no code that releases the buffers.
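
For contrast, here is a minimal sketch of the per-reader cleanup that appears to be missing. It assumes the reader has a cleanup hook and that a managed {{DrillBuf}} can safely be released before the operator closes; neither is verified here, so treat it as illustrative only.

{code}
  // Sketch only: hypothetical cleanup hook on the record reader. Assumes
  // releasing a managed DrillBuf early is compatible with the operator's
  // buffer manager.
  public void cleanup() {
    if (readBuffer != null) {
      readBuffer.release();
      readBuffer = null;
    }
    if (whitespaceBuffer != null) {
      whitespaceBuffer.release();
      whitespaceBuffer = null;
    }
  }
{code}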

The sizes are:

{code}
  private static final int READ_BUFFER = 1024*1024;
  private static final int WHITE_SPACE_BUFFER = 64*1024;

= 1,048,576 + 65,536 = 1,114,112 bytes
{code}

This is exactly the amount of memory that accumulates per call to {{ScanBatch.next()}}:

{code}
Ctor: 0  -- Initial memory in constructor
Init setup: 1114112  -- After call to first record reader setup
Entry Memory: 1114112  -- first next() call, returns one record
Entry Memory: 1114112  -- second next(), eof and start second reader
Entry Memory: 2228224  -- third next(), second reader returns EOF
...
{code}

At roughly 1 MB leaked per file, 5000 files leak about 5 GB of direct memory, which explains
the OOM when only 4 GB is available.
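
A quick back-of-the-envelope check of that estimate, using the constants above (plain Java, no Drill classes needed):

{code}
// Standalone arithmetic check of the leak estimate.
public class LeakEstimate {
  public static void main(String[] args) {
    final long perReader = 1024 * 1024 + 64 * 1024;  // READ_BUFFER + WHITE_SPACE_BUFFER
    final long files = 5000;
    final long leaked = perReader * files;           // 5,570,560,000 bytes
    System.out.printf("leaked = %,d bytes (~%.1f GiB)%n",
        leaked, leaked / (1024.0 * 1024 * 1024));    // ~5.2 GiB, past a 4 GiB limit
  }
}
{code}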



