hive-issues mailing list archives

From "Eric Wohlstadter (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-20203) Arrow SerDe leaks a DirectByteBuffer
Date Wed, 18 Jul 2018 18:54:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-20203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16548266#comment-16548266 ]

Eric Wohlstadter commented on HIVE-20203:
-----------------------------------------

[~teddy.choi]

Can you please review the patch?

The patch is covered by the existing TestJdbcWithMiniLlapArrow test: if the memory leak
were still present, LlapArrowRecordWriter would throw an exception while
TestJdbcWithMiniLlapArrow runs.

/cc [~mmccline]

> Arrow SerDe leaks a DirectByteBuffer
> ------------------------------------
>
>                 Key: HIVE-20203
>                 URL: https://issues.apache.org/jira/browse/HIVE-20203
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Eric Wohlstadter
>            Assignee: Eric Wohlstadter
>            Priority: Blocker
>         Attachments: HIVE-20203.1.patch
>
>
> ArrowColumnarBatchSerDe allocates an Arrow NullableMapVector for each task that uses
> the SerDe.
> The vector is a DirectByteBuffer allocated from Arrow's off-heap buffer pool.
> This buffer is never closed and leaks about 1K of physical memory for each task.
> This patch does three things:
>  # Ensures the buffer is closed when the RecordWriter for the task is closed.
>  # Adds per-task memory accounting by assigning each task a ChildAllocator from the
> RootAllocator.
>  # Enforces that the ChildAllocator for a task has released all memory assigned to it
> when the task is completed.
> The patch assumes that close() is always called on the RecordWriter when a task is finished
> (even if there is a failure during task execution).
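
The three steps in the description can be illustrated with a minimal, self-contained sketch. This is not the code from HIVE-20203.1.patch and it does not use Arrow's classes: TaskAllocator, TaskBuffer, and TaskRecordWriter are hypothetical stand-ins for a per-task child allocator, the off-heap vector, and the record writer, and the 1024-byte size only mirrors the "about 1K" figure above. The sketch assumes, as the description does, that close() is always called on the writer.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a per-task child allocator (not Arrow's API). */
final class TaskAllocator implements AutoCloseable {
    private final String taskId;
    private final AtomicLong outstandingBytes = new AtomicLong();

    TaskAllocator(String taskId) {
        this.taskId = taskId;
    }

    /** Hands out a buffer and charges its size to this task. */
    TaskBuffer allocate(long bytes) {
        outstandingBytes.addAndGet(bytes);
        return new TaskBuffer(this, bytes);
    }

    void release(long bytes) {
        outstandingBytes.addAndGet(-bytes);
    }

    long outstandingBytes() {
        return outstandingBytes.get();
    }

    /** Step 3: refuse to close while the task still holds memory. */
    @Override
    public void close() {
        long leaked = outstandingBytes.get();
        if (leaked != 0) {
            throw new IllegalStateException(
                "Task " + taskId + " leaked " + leaked + " bytes");
        }
    }
}

/** Hypothetical stand-in for the off-heap vector owned by a task. */
final class TaskBuffer implements AutoCloseable {
    private final TaskAllocator owner;
    private final long bytes;
    private boolean closed;

    TaskBuffer(TaskAllocator owner, long bytes) {
        this.owner = owner;
        this.bytes = bytes;
    }

    /** Returns the bytes to the owning allocator; safe to call twice. */
    @Override
    public void close() {
        if (!closed) {
            closed = true;
            owner.release(bytes);
        }
    }
}

/** Hypothetical stand-in for the task's record writer. */
final class TaskRecordWriter implements AutoCloseable {
    private final TaskAllocator allocator;
    private final TaskBuffer rootVector;

    TaskRecordWriter(String taskId) {
        // Step 2: each task gets its own allocator for accounting.
        this.allocator = new TaskAllocator(taskId);
        this.rootVector = allocator.allocate(1024);
    }

    TaskAllocator allocator() {
        return allocator;
    }

    @Override
    public void close() {
        rootVector.close(); // Step 1: release the buffer with the writer.
        allocator.close();  // Step 3: throws if anything is still held.
    }
}

final class LeakCheckDemo {
    public static void main(String[] args) {
        // try-with-resources runs close() even if the body throws.
        try (TaskRecordWriter writer = new TaskRecordWriter("task-1")) {
            System.out.println(writer.allocator().outstandingBytes());
        }

        // A buffer that is never closed makes the allocator's close() fail.
        TaskAllocator leaky = new TaskAllocator("task-2");
        leaky.allocate(1024);
        try {
            leaky.close();
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model a leak surfaces as an exception at close time, which is the same mechanism the comment relies on when it says an existing test would fail if the leak were still present.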



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
