drill-issues mailing list archives

From "Paul Rogers (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DRILL-5312) "Record batch sizer" does not include overhead for variable-sized vectors
Date Thu, 02 Mar 2017 23:32:45 GMT

     [ https://issues.apache.org/jira/browse/DRILL-5312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers updated DRILL-5312:
-------------------------------
    Description: 
The new "record batch sizer" computes the actual data size of a record given a batch of vectors.
For most purposes, the record width must include the overhead of the offset vectors for variable-sized
vectors. The initial code drop included only the character data, but not the offset vector
size when computing row width.

Since the "managed" external sort relies on the computed row width to determine memory usage,
underestimating the row width can cause an OOM under certain low-memory conditions.

  was:The new "record batch sizer" computes the actual data size of a record given a batch
of vectors. For most purposes, the record width must include the overhead of the offset vectors
for variable-sized vectors. The initial code drop included only the character data, but not
the offset vector size when computing row width.


> "Record batch sizer" does not include overhead for variable-sized vectors
> -------------------------------------------------------------------------
>
>                 Key: DRILL-5312
>                 URL: https://issues.apache.org/jira/browse/DRILL-5312
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: 1.10.0
>
>
> The new "record batch sizer" computes the actual data size of a record given a batch
> of vectors. For most purposes, the record width must include the overhead of the offset vectors
> for variable-sized vectors. The initial code drop included only the character data, but not
> the offset vector size when computing row width.
> Since the "managed" external sort relies on the computed row width to determine memory
> usage, underestimating the row width can cause an OOM under certain low-memory conditions.
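
The size of the gap described above can be illustrated with a minimal sketch. This is not the actual record batch sizer code: the class and method names are hypothetical, and it assumes each offset vector entry is a 4-byte integer with one more entry than there are rows.

```java
// Hypothetical sketch, not Drill's record batch sizer.
final class RowWidthSketch {
    // Assumption: one 4-byte offset entry per value, plus one trailing entry.
    static final int OFFSET_ENTRY_BYTES = 4;

    // Average width counting only the character data (the underestimate).
    static int dataOnlyWidth(int totalDataBytes, int rowCount) {
        return ceilDiv(totalDataBytes, rowCount);
    }

    // Average width counting the character data plus the offset vector.
    static int widthWithOffsets(int totalDataBytes, int rowCount) {
        int offsetBytes = (rowCount + 1) * OFFSET_ENTRY_BYTES;
        return ceilDiv(totalDataBytes + offsetBytes, rowCount);
    }

    // Integer division rounded up, so the estimate never falls short.
    static int ceilDiv(int a, int b) {
        return (a + b - 1) / b;
    }

    public static void main(String[] args) {
        int rowCount = 1000;
        int totalDataBytes = 10_000; // 10 bytes of character data per row on average
        System.out.println("data only:    " + dataOnlyWidth(totalDataBytes, rowCount));
        System.out.println("with offsets: " + widthWithOffsets(totalDataBytes, rowCount));
    }
}
```

Under these assumptions, the relative error is largest for short values: with 10-byte strings, leaving out the offset entries understates the row width by roughly a third, which is the kind of shortfall that can matter when the sort is close to its memory limit.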



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
