hive-issues mailing list archives

From "Vihang Karajgaonkar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields
Date Sat, 05 May 2018 01:25:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464550#comment-16464550 ]

Vihang Karajgaonkar commented on HIVE-19041:
--------------------------------------------

v3 of the patch includes additional Thrift-generated files that operate on lists of partitions,
such as {{AddPartitionsRequest}} and {{DropPartitionsRequest}}. I also found that some of the
statistics-related Thrift classes do not intern repeated fields. I wish there was some way to tell
Thrift to do this instead of us having to keep up with newly added Thrift classes.

> Thrift deserialization of Partition objects should intern fields
> ----------------------------------------------------------------
>
>                 Key: HIVE-19041
>                 URL: https://issues.apache.org/jira/browse/HIVE-19041
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore
>    Affects Versions: 3.0.0, 2.3.2
>            Reporter: Vihang Karajgaonkar
>            Assignee: Vihang Karajgaonkar
>            Priority: Major
>         Attachments: HIVE-19041.01.patch, HIVE-19041.02.patch, HIVE-19041.03.patch
>
>
> When a client creates a large number of partitions, the thrift objects are deserialized
> into Partition objects. The read method of these objects does not intern the inputformat,
> location, and outputformat fields, which causes a large number of duplicate Strings in HMS
> memory. We should intern these fields during deserialization to reduce memory pressure.
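
For illustration only, here is a minimal sketch of the idea described above. It is not taken from the attached patches. {{InternSketch}}, {{StorageInfo}}, {{read}}, and {{decode}} are hypothetical names standing in for the Thrift-generated classes and their read method, the location value is made up, and plain {{String.intern()}} is assumed as the interning mechanism; the actual patch may use a different helper.

```java
import java.nio.charset.StandardCharsets;

public class InternSketch {

    // Hypothetical stand-in for the storage fields of a Thrift-generated
    // Partition; this is NOT the real Hive class.
    static final class StorageInfo {
        final String location;
        final String inputFormat;
        final String outputFormat;

        StorageInfo(String location, String inputFormat, String outputFormat) {
            this.location = location;
            this.inputFormat = inputFormat;
            this.outputFormat = outputFormat;
        }
    }

    // Null-safe wrapper around String.intern() (assumed mechanism).
    static String intern(String s) {
        return s == null ? null : s.intern();
    }

    // Decoding bytes from the wire always allocates a fresh String instance.
    static String decode(byte[] raw) {
        return raw == null ? null : new String(raw, StandardCharsets.UTF_8);
    }

    // Mimics a read method: each field is interned right after decoding,
    // so equal values across many partitions share one String instance.
    static StorageInfo read(byte[] location, byte[] inputFormat, byte[] outputFormat) {
        return new StorageInfo(
            intern(decode(location)),
            intern(decode(inputFormat)),
            intern(decode(outputFormat)));
    }

    public static void main(String[] args) {
        byte[] loc = "hdfs://nn/warehouse/t".getBytes(StandardCharsets.UTF_8); // made-up path
        byte[] in = "org.apache.hadoop.mapred.TextInputFormat".getBytes(StandardCharsets.UTF_8);
        byte[] out = "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"
            .getBytes(StandardCharsets.UTF_8);

        // Two "partitions" deserialized independently from equal bytes.
        StorageInfo a = read(loc, in, out);
        StorageInfo b = read(loc.clone(), in.clone(), out.clone());

        // Without interning, these would be two distinct instances.
        System.out.println("shared inputFormat instance: " + (a.inputFormat == b.inputFormat));
        System.out.println("raw decode shares instance: " + (decode(in) == decode(in)));
    }
}
```

Under these assumptions, the first line prints {{true}} and the second prints {{false}}, which is the duplication the issue describes: every decode allocates a new String unless the read path interns it.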



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
