hive-dev mailing list archives

From "Jonathan Chang (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-2575) Hive runs out of memory with a large number of partitions
Date Fri, 11 Nov 2011 21:58:57 GMT
Hive runs out of memory with a large number of partitions
---------------------------------------------------------

                 Key: HIVE-2575
                 URL: https://issues.apache.org/jira/browse/HIVE-2575
             Project: Hive
          Issue Type: Bug
            Reporter: Jonathan Chang


When a query needs to fetch a large number of partitions (say ~10k), it takes several
minutes just to generate the query plan, and the client will often run out of memory.

A quick investigation shows that the partition pruner is relatively fast, but the actual
fetch of the partitions is quite slow, with most of the time spent in DataNucleus-generated
code. The amount of data that must be pulled and stored for each Partition object also
appears to be quite large.
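
One way to put numbers on the two symptoms above (fetch time and client memory) is to wrap
the fetch in a small timing and allocation harness. The sketch below is only an illustration
in Python, not Hive code: fake_fetch_partitions is a hypothetical stand-in for the real
metastore call, and the fields on each record are invented for the example. The harness
itself (profile_fetch) only assumes a callable that takes a partition count and returns a
list.

```python
import time
import tracemalloc
from typing import Callable, Dict, List, Tuple


def fake_fetch_partitions(n: int) -> List[Dict[str, object]]:
    """Hypothetical stand-in for a metastore fetch; builds n partition-like records."""
    return [
        {
            # Invented fields, loosely modelled on what a partition record might carry.
            "values": [f"2011-11-{i % 30 + 1:02d}", str(i)],
            "location": f"/warehouse/tbl/part={i}",
            "parameters": {"numFiles": "1"},
        }
        for i in range(n)
    ]


def profile_fetch(
    fetch: Callable[[int], List[Dict[str, object]]], n: int
) -> Tuple[int, float, int]:
    """Return (partitions fetched, elapsed seconds, peak traced bytes) for one fetch."""
    tracemalloc.start()
    start = time.perf_counter()
    partitions = fetch(n)  # all results are held in memory, as in the report
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return len(partitions), elapsed, peak


if __name__ == "__main__":
    for n in (1_000, 10_000):
        count, elapsed, peak = profile_fetch(fake_fetch_partitions, n)
        print(f"{count} partitions: {elapsed:.3f}s, peak {peak / 1024:.0f} KiB")
```

Running the same harness at a few partition counts shows whether time and peak memory grow
roughly linearly with the number of partitions held at once, which is the pattern the
report describes.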

