drill-issues mailing list archives

From "Paul Rogers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5211) External sort fails to allocate merge memory when plenty is free
Date Mon, 23 Jan 2017 06:18:26 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15833947#comment-15833947 ]

Paul Rogers commented on DRILL-5211:
------------------------------------

For this use case, the input is 18 GB. Data arrives at the sort in batches of size 128 MB. The
vectors that make up each batch are:

* Offsets, length of 32,768
* RepeatedVarCharVector, length of 67,108,864

Despite this, the total size, when passed into the sort, is reported as 134,340,608 bytes.

This seems an incredible waste of space and a very large source of internal fragmentation,
perhaps due to the power-of-two allocation rule. (Though 67,108,864 is, itself, a power of two...)
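
As a rough illustration only (not Drill's actual allocator code), rounding each request up to
the next power of two can nearly double a request that sits just above a power-of-two boundary:

{code:java}
// Hypothetical illustration of power-of-two rounding; not Drill's allocator code.
public class PowerOfTwoWaste {
  // Round a requested size up to the next power of two, the way buddy-style
  // allocators typically size the buffers they hand out.
  static long roundUpToPowerOfTwo(long requested) {
    return requested <= 1 ? 1 : Long.highestOneBit(requested - 1) << 1;
  }

  public static void main(String[] args) {
    long requested = 67_108_864L + 131_072L;          // e.g. a 64 MB vector plus a small offsets vector
    long allocated = roundUpToPowerOfTwo(requested);  // rounds up to 134,217,728 (128 MB)
    System.out.printf("requested=%,d allocated=%,d wasted=%,d%n",
        requested, allocated, allocated - requested);
  }
}
{code}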

That aside, we see that the incoming vector is far larger than the 16 MB chunk size used in
the free chunk list.
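
For context, a pooled allocator in the style of Netty's only recycles buffers up to its chunk
size; anything larger bypasses the free list and falls through to a fresh system allocation,
roughly as sketched below (a hypothetical simplification, not Drill's code):

{code:java}
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: requests larger than the pool's chunk size cannot be served
// from the free chunk list, so they go straight to the OS and count against the
// direct-memory ceiling even while plenty of pooled memory sits idle.
public class ChunkPoolSketch {
  static final int CHUNK_SIZE = 16 * 1024 * 1024;   // 16 MB pooled chunks
  private final Deque<ByteBuffer> freeChunks = new ArrayDeque<>();

  ByteBuffer allocate(int size) {
    if (size > CHUNK_SIZE) {
      // "Huge" allocation: the free list is useless here; ask the system directly.
      return ByteBuffer.allocateDirect(size);
    }
    ByteBuffer chunk = freeChunks.poll();
    return chunk != null ? chunk : ByteBuffer.allocateDirect(CHUNK_SIZE);
  }

  void release(ByteBuffer buf) {
    if (buf.capacity() == CHUNK_SIZE) {
      freeChunks.push(buf);                         // only pooled-size chunks are recycled
    }
    // larger buffers are simply dropped and reclaimed later
  }
}
{code}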

> External sort fails to allocate merge memory when plenty is free
> ----------------------------------------------------------------
>
>                 Key: DRILL-5211
>                 URL: https://issues.apache.org/jira/browse/DRILL-5211
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: 1.9.0
>
>
> Consider a test of the external sort as follows:
> * Direct memory: 3 GB
> * Input file: 18 GB, with one Varchar column of 8K width
> The sort runs, spilling to disk. Once all data arrives, the sort begins to merge the
> results. But, to do that, it must first do an intermediate merge. For example, in this sort,
> there are 190 spill files, but only 19 can be merged at a time. (Each merge file contains
> 128 MB batches, and only 19 can fit in memory, giving a total footprint of 2.5 GB, well below
> the 3 GB limit.)
> Yet, when loading batch xx, Drill fails with an OOM error. At that point, total available
> direct memory is 3,817,865,216. (Obtained from {{maxMemory}} in the {{Bits}} class in the
> JDK.)
> It appears that Drill wants to allocate 58,257,868 bytes, but the {{totalCapacity}} (again
> in {{Bits}}) is already 3,800,769,206, causing an OOM.
> The problem is that, at this point, the external sort should not ask the system for more
> memory. The allocator for the external sort is at just 1,192,350,366 bytes before the allocation
> request. Plenty of spare memory should be available, released when the in-memory batches were
> spilled to disk prior to merging. Indeed, earlier in the run, the sort had reached a peak
> memory usage of 2,710,716,416 bytes. This memory should be available for reuse during merging,
> and is plenty sufficient to satisfy the particular request in question.
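
For reference, the merge arithmetic in the quoted description can be checked with a few lines
(hypothetical names, not Drill's planner code):

{code:java}
// Back-of-the-envelope check of the merge footprint described above; not Drill code.
public class MergeFootprintCheck {
  public static void main(String[] args) {
    long batchSize   = 134_217_728L;            // 128 MB batches read back from each spill file
    int  mergeWidth  = 19;                      // spill files merged at a time, per the description
    long footprint   = mergeWidth * batchSize;  // 2,550,136,832 bytes, roughly 2.5 GB
    long memoryLimit = 3_221_225_472L;          // the 3 GB direct-memory budget (taken as 3 GiB)
    System.out.printf("footprint=%,d bytes, limit=%,d bytes, headroom=%,d bytes%n",
        footprint, memoryLimit, memoryLimit - footprint);
  }
}
{code}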



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
