hive-dev mailing list archives

From "Phabricator (JIRA)" <>
Subject [jira] [Commented] (HIVE-4248) Implement a memory manager for ORC
Date Tue, 09 Apr 2013 04:04:17 GMT


Phabricator commented on HIVE-4248:

omalley has commented on the revision "HIVE-4248 [jira] Implement a memory manager for ORC".

  I agree that it can overshoot, but it is unlikely to overshoot by much. In the normal
case the dynamic partitions are distributed randomly, and the current version will do fine.
Granted, if the data is already sorted by the dynamic partition, it will not do well.

  Ok, I'll add a check when we add a new partition. My only concern was that with each new
partition added, it will take longer and longer to do all of the checks.
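
  For readers following the thread, here is a minimal sketch of the general idea: track
every open writer and scale down each one's share of the heap when the total requested
exceeds a fixed pool. It is not the code from the attached patches. The class name
OrcMemoryPoolSketch, its methods, and the use of a running total (so that the check on
adding a partition stays constant-time) are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; not the memory manager from the HIVE-4248 patches. */
final class OrcMemoryPoolSketch {
    // Heap budget (in bytes) shared by all concurrent writers.
    private final long poolBytes;
    // Requested stripe size per writer, keyed by an assumed path/partition name.
    private final Map<String, Long> requested = new HashMap<>();
    // Running sum of all requests, so adding a writer never rescans the map.
    private long totalRequested = 0;

    OrcMemoryPoolSketch(long poolBytes) {
        if (poolBytes <= 0) {
            throw new IllegalArgumentException("poolBytes must be positive");
        }
        this.poolBytes = poolBytes;
    }

    /** Registers a writer (or updates its request) in O(1). */
    synchronized void addWriter(String path, long stripeBytes) {
        Long previous = requested.put(path, stripeBytes);
        totalRequested += stripeBytes - (previous == null ? 0L : previous);
    }

    /** Unregisters a writer, e.g. when its file is closed. */
    synchronized void removeWriter(String path) {
        Long previous = requested.remove(path);
        if (previous != null) {
            totalRequested -= previous;
        }
    }

    /** Returns 1.0 while the requests fit in the pool, otherwise the shrink factor. */
    synchronized double scale() {
        if (totalRequested <= poolBytes) {
            return 1.0;
        }
        return (double) poolBytes / totalRequested;
    }

    /** Bytes the given writer may buffer before it should flush a stripe. */
    synchronized long allowedBytes(String path) {
        Long want = requested.get(path);
        if (want == null) {
            throw new IllegalArgumentException("unknown writer: " + path);
        }
        return (long) (want * scale());
    }
}
```

  In this sketch a writer would compare its buffered size against allowedBytes(...) and
flush early when it is over. How often each writer re-checks is the trade-off discussed
above: checking rarely is cheap but lets a writer overshoot its share.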


To: JIRA, omalley
Cc: kevinwilfong

> Implement a memory manager for ORC
> ----------------------------------
>                 Key: HIVE-4248
>                 URL:
>             Project: Hive
>          Issue Type: New Feature
>          Components: Serializers/Deserializers
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>         Attachments: HIVE-4248.D9993.1.patch, HIVE-4248.D9993.2.patch
> With the large default stripe size (256MB) and dynamic partitions, it is quite easy for
users to run out of memory when writing ORC files. We probably need a solution that keeps
track of the total number of concurrent ORC writers and divides the available heap space between

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:
