hive-issues mailing list archives

From "Eric Wohlstadter (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-20203) Arrow SerDe leaks a DirectByteBuffer
Date Thu, 19 Jul 2018 21:08:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-20203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Wohlstadter updated HIVE-20203:
------------------------------------
    Attachment: HIVE-20203.3.patch

> Arrow SerDe leaks a DirectByteBuffer
> ------------------------------------
>
>                 Key: HIVE-20203
>                 URL: https://issues.apache.org/jira/browse/HIVE-20203
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Eric Wohlstadter
>            Assignee: Eric Wohlstadter
>            Priority: Blocker
>         Attachments: HIVE-20203.1.patch, HIVE-20203.2.patch, HIVE-20203.3.patch
>
>
> ArrowColumnarBatchSerDe allocates an Arrow NullableMapVector for each task that uses the SerDe.
> The vector is backed by a DirectByteBuffer allocated from Arrow's off-heap buffer pool.
> This buffer is never closed, so each task leaks about 1 KB of physical memory.
> This patch does three things:
>  # Ensures the buffer is closed when the RecordWriter for the task is closed.
>  # Adds per-task memory accounting by assigning each task a ChildAllocator from the RootAllocator.
>  # Enforces that the ChildAllocator for a task has released all memory assigned to it when the task completes.
> The patch assumes that close() is always called on the RecordWriter when a task is finished (even if there is a failure during task execution).
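
The three steps in the description can be sketched roughly as below. This is a minimal, JDK-only illustration and not the code in the attached patch: TaskAllocator, TaskAllocator.Buffer and TaskRecordWriter are hypothetical stand-ins for Arrow's ChildAllocator, the vector's off-heap buffer, and Hive's RecordWriter, and the 1024-byte size is only a placeholder for the "about 1K" mentioned above.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical per-task allocator that tracks outstanding bytes (not Arrow's API). */
final class TaskAllocator implements AutoCloseable {
    private final String taskId;
    private final AtomicLong outstanding = new AtomicLong();

    TaskAllocator(String taskId) {
        this.taskId = taskId;
    }

    /** Hypothetical handle for one allocation; closing it returns the bytes. */
    final class Buffer implements AutoCloseable {
        private final long size;
        private boolean released;

        private Buffer(long size) {
            this.size = size;
        }

        @Override
        public void close() {
            // Idempotent: a second close() must not double-count the release.
            if (!released) {
                released = true;
                outstanding.addAndGet(-size);
            }
        }
    }

    /** Step 2: every allocation is charged to this task's allocator. */
    Buffer allocate(long bytes) {
        outstanding.addAndGet(bytes);
        return new Buffer(bytes);
    }

    long outstandingBytes() {
        return outstanding.get();
    }

    /** Step 3: closing the allocator fails loudly if anything is still held. */
    @Override
    public void close() {
        long leaked = outstanding.get();
        if (leaked != 0) {
            throw new IllegalStateException(
                "Task " + taskId + " leaked " + leaked + " bytes");
        }
    }
}

/** Hypothetical writer that owns the per-task vector buffer. */
final class TaskRecordWriter implements AutoCloseable {
    private final TaskAllocator allocator;
    private final TaskAllocator.Buffer rootVector;

    TaskRecordWriter(String taskId) {
        this.allocator = new TaskAllocator(taskId);
        this.rootVector = allocator.allocate(1024); // placeholder size
    }

    TaskAllocator allocator() {
        return allocator;
    }

    /** Step 1: release the buffer first, then verify nothing is left. */
    @Override
    public void close() {
        rootVector.close();
        allocator.close();
    }
}
```

Wrapping the writer in try-with-resources (or a finally block) is one way to satisfy the assumption that close() runs even when task execution fails.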



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
