drill-issues mailing list archives

From "Paul Rogers (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DRILL-5100) External Sort does not manage memory requirements of a schema change
Date Sun, 04 Dec 2016 00:31:58 GMT

     [ https://issues.apache.org/jira/browse/DRILL-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers updated DRILL-5100:
-------------------------------
    Issue Type: Sub-task  (was: Bug)
        Parent: DRILL-5080

> External Sort does not manage memory requirements of a schema change
> --------------------------------------------------------------------
>
>                 Key: DRILL-5100
>                 URL: https://issues.apache.org/jira/browse/DRILL-5100
>             Project: Apache Drill
>          Issue Type: Sub-task
>    Affects Versions: 1.8.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>
> The external sort is given a fixed amount of memory to hold buffered in-memory batches
prior to spilling. External sort also handles certain schema changes when union vectors are
enabled. When a schema change occurs, existing vectors are coerced into the new schema format,
perhaps replacing an existing vector with a new union vector.
> This conversion requires (direct) memory. When it occurs after the external sort has almost
filled its in-memory buffer, the conversion can cause memory overflow and failure.
> The following shows the allocated memory before and after schema changes in the unit test
{{TestExternalSort.testNumericTypes}}:
> {code}
> Before: 134144
> After: 150528
> Before: 150528
> After: 166912
> {code}
> Union vectors appear to be larger than the original BIGINT vectors. External sort must
anticipate this and perhaps spill to ensure sufficient room exists for the new, larger vectors.
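
To make the growth concrete, the small sketch below subtracts each "Before" figure from its "After" figure. The class and method names are hypothetical and are not part of Drill. The figures are assumed to be byte counts, which the report does not state explicitly. The arithmetic only shows how much the allocation rose at each schema change; it does not by itself establish why.

```java
public class SchemaChangeGrowth {

    /** Returns how much the allocated memory rose across one schema change. */
    public static long growth(long before, long after) {
        return after - before;
    }

    public static void main(String[] args) {
        // Before/After pairs copied from the TestExternalSort.testNumericTypes output above.
        long[][] samples = { {134144L, 150528L}, {150528L, 166912L} };
        for (long[] s : samples) {
            long delta = growth(s[0], s[1]);
            System.out.println("before=" + s[0] + " after=" + s[1] + " growth=" + delta);
        }
    }
}
```

Both schema changes raise the allocation by the same amount, 16384 (16 KiB if the figures are bytes).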
> Further, the conversion process itself requires that two copies of each vector be in
memory: the original and the new, converted one. The external sort does not check to ensure
this much working memory is available, leading to potential OOM errors during each vector
conversion.
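
One possible shape for the missing check is sketched below. It is only an illustration under stated assumptions, not Drill's implementation: the class, method, and parameter names are invented, the budget is made up, and the size of the converted vector is assumed to be known or estimated in advance. The idea is that the original vector is already counted in the current allocation, so the peak during conversion is the current allocation plus the converted copy.

```java
public class ConversionHeadroom {

    /**
     * Hypothetical check: returns true if converting one vector would push the
     * sort past its memory budget, meaning it should spill before converting.
     *
     * allocatedBytes already includes the original vector. While the conversion
     * runs, the converted copy is held alongside the original, so the peak is
     * allocatedBytes + convertedVectorBytes.
     */
    public static boolean mustSpillFirst(long allocatedBytes,
                                         long budgetBytes,
                                         long convertedVectorBytes) {
        long peakBytes = allocatedBytes + convertedVectorBytes;
        return peakBytes > budgetBytes;
    }

    public static void main(String[] args) {
        long budget = 160_000L;     // made-up budget for illustration
        long converted = 16_384L;   // illustrative size of the converted vector

        // 134144 + 16384 = 150528, which fits within 160000.
        System.out.println(mustSpillFirst(134_144L, budget, converted)); // prints false

        // 150528 + 16384 = 166912, which exceeds 160000.
        System.out.println(mustSpillFirst(150_528L, budget, converted)); // prints true
    }
}
```

A real check would also have to account for the larger steady-state size of the union vectors once the originals are released, as described in the report.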



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
