hive-dev mailing list archives

From "anishek (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-16904) during repl load for large number of partitions the metadata file can be huge and can lead to out of memory
Date Thu, 15 Jun 2017 05:38:00 GMT
anishek created HIVE-16904:
------------------------------

             Summary: during repl load for large number of partitions the metadata file can be huge and can lead to out of memory
                 Key: HIVE-16904
                 URL: https://issues.apache.org/jira/browse/HIVE-16904
             Project: Hive
          Issue Type: Sub-task
    Affects Versions: 3.0.0
            Reporter: anishek
            Assignee: anishek
             Fix For: 3.0.0


The metadata for a table and all of its partitions is stored in a single file. During repl
load, the entire file is read into memory in one shot, and only then are the individual
partitions processed. For a table with a large number of partitions this causes a large memory
overhead and can lead to an out-of-memory error. Proposed fix: try deserializing the partition
objects with a streaming JSON deserializer, so that partitions can be processed without holding
the whole file in memory.
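
To illustrate the idea of streaming deserialization, here is a minimal sketch in Python using only the standard library. It is not Hive's implementation (Hive is written in Java, and the actual fix may look quite different). The file layout is an assumption made for illustration: a JSON object that contains an array of partition objects under a key named "partitions". The function name iter_partitions and its parameters are hypothetical as well.

```python
import io
import json


def iter_partitions(fp, key="partitions", chunk_size=64 * 1024):
    """Yield the elements of the JSON array stored under `key`, one at a time.

    Assumptions of this sketch: the key name is hypothetical, the array
    elements are JSON objects, and the quoted key text does not also occur
    earlier in the file inside some string value.
    """
    decoder = json.JSONDecoder()
    marker = json.dumps(key)  # the key as it appears in the file, with quotes
    buf = ""

    # Phase 1: read chunks until the '[' that opens the array after the key.
    while True:
        pos = buf.find(marker)
        if pos != -1:
            start = buf.find("[", pos + len(marker))
            if start != -1:
                buf = buf[start + 1:]
                break
        chunk = fp.read(chunk_size)
        if not chunk:
            return  # key (or its array) not present: nothing to yield
        if pos == -1:
            # Keep only a short tail in case the key straddles two chunks.
            buf = buf[-len(marker):] + chunk
        else:
            buf += chunk

    # Phase 2: decode one array element at a time, refilling the buffer
    # whenever the current element is still incomplete.
    while True:
        buf = buf.lstrip(" \t\r\n,")
        if buf.startswith("]"):
            return  # end of the array
        try:
            obj, end = decoder.raw_decode(buf)
        except json.JSONDecodeError:
            chunk = fp.read(chunk_size)
            if not chunk:
                raise  # input is truncated or malformed
            buf += chunk
            continue
        yield obj
        buf = buf[end:]  # drop the consumed element from the buffer


if __name__ == "__main__":
    # Small in-memory stand-in for a metadata file.
    sample = json.dumps({
        "table": {"name": "t"},
        "partitions": [{"values": [str(i)]} for i in range(3)],
    })
    for partition in iter_partitions(io.StringIO(sample), chunk_size=16):
        print(partition)
```

With this shape, memory use is bounded by the chunk size plus the largest single partition object rather than by the size of the whole file. A production version would use a real tokenizing parser instead of searching for the key as text, and would avoid re-parsing a large element from the start each time the buffer is refilled.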




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
