hive-dev mailing list archives

From "Sahil Takiar (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-17638) SparkDynamicPartitionPruner loads all partition metadata into memory
Date Thu, 28 Sep 2017 19:10:00 GMT
Sahil Takiar created HIVE-17638:
-----------------------------------

             Summary: SparkDynamicPartitionPruner loads all partition metadata into memory
                 Key: HIVE-17638
                 URL: https://issues.apache.org/jira/browse/HIVE-17638
             Project: Hive
          Issue Type: Sub-task
          Components: Spark
            Reporter: Sahil Takiar


The {{SparkDynamicPartitionPruner}} first loads the full contents of every partition pruning file
into memory, and only then prunes partitions from the {{MapWork}}. Because all of the partition
metadata must be held in memory at once, this can increase memory pressure on the Hive-on-Spark
(HoS) Remote Driver. It would be more efficient to prune partitions while scanning the files,
so that the partition metadata never needs to be fully buffered in memory.
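
The difference between the two approaches can be illustrated with a minimal, self-contained sketch. It does not use Hive's classes or its actual pruning-file format: the class {{StreamingPruneSketch}}, both method names, and the assumption that a pruning file is plain text with one partition value per line are all hypothetical and chosen only for illustration. The buffered variant collects every value before pruning, while the streaming variant marks partitions as retained while the files are being scanned.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical illustration only; none of these names exist in Hive.
final class StreamingPruneSketch {

    // Reads a (hypothetical) pruning file line by line, one partition value per line.
    private static void forEachValue(Reader file, Consumer<String> action) {
        try (BufferedReader reader = new BufferedReader(file)) {
            String line;
            while ((line = reader.readLine()) != null) {
                action.accept(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Buffered approach: every value from every file is held in memory
    // before any partition is pruned.
    // 'partitions' maps a partition path to its partition column value.
    static Set<String> pruneBuffered(Map<String, String> partitions, List<Reader> pruningFiles) {
        Set<String> allValues = new HashSet<>();
        for (Reader file : pruningFiles) {
            forEachValue(file, allValues::add);
        }
        Set<String> kept = new HashSet<>();
        for (Map.Entry<String, String> e : partitions.entrySet()) {
            if (allValues.contains(e.getValue())) {
                kept.add(e.getKey());
            }
        }
        return kept;
    }

    // Streaming approach: each value is checked against the partitions as it
    // is read and then discarded, so the file contents are never buffered.
    // Extra memory is bounded by the number of partitions, not by file size.
    static Set<String> pruneStreaming(Map<String, String> partitions, List<Reader> pruningFiles) {
        // Index partition paths by value so each scanned value is a single lookup.
        Map<String, Set<String>> pathsByValue = new HashMap<>();
        for (Map.Entry<String, String> e : partitions.entrySet()) {
            pathsByValue.computeIfAbsent(e.getValue(), k -> new HashSet<>()).add(e.getKey());
        }
        Set<String> kept = new HashSet<>();
        for (Reader file : pruningFiles) {
            forEachValue(file, value -> {
                // Removing the entry makes repeated values a cheap no-op.
                Set<String> matched = pathsByValue.remove(value);
                if (matched != null) {
                    kept.addAll(matched);
                }
            });
        }
        return kept;
    }
}
```

Both methods return the same set of retained partition paths. The streaming variant only changes when the comparison happens, which is the behavior this issue proposes; how that would map onto the real pruner and {{MapWork}} is left open here.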



