hive-dev mailing list archives

From "Owen O'Malley (JIRA)" <>
Subject [jira] [Updated] (HIVE-4421) Improve memory usage by ORC dictionaries
Date Thu, 25 Apr 2013 17:46:16 GMT


Owen O'Malley updated HIVE-4421:

    Fix Version/s: 0.11.0
           Status: Patch Available  (was: Open)

This patch does three things:
* Reduces memory usage while writing ORC dictionaries by removing the counts and storing
only offsets instead of both offsets and lengths.
* Improves the tracking of how much memory the dictionaries use by tracking the allocation
rather than the usage.
* Reduces the allocation sizes of some of the integer arrays.
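
To illustrate the first two points, here is a minimal Java sketch. It is not the code
from the patch: the class name OffsetOnlyDictionary and all of its methods are
hypothetical, and the doubling growth policy is an assumption. It shows how an entry's
length can be derived from the next entry's offset, so no separate lengths array is
needed, and how the bytes reserved by the backing arrays can differ from the bytes
actually in use.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical illustration only; not the ORC writer's actual classes. */
final class OffsetOnlyDictionary {
  private byte[] bytes;       // concatenated UTF-8 bytes of all entries
  private int[] offsets;      // offsets[i] = start of entry i; no lengths array
  private int size = 0;       // number of entries stored
  private int bytesUsed = 0;  // bytes of the byte buffer actually written

  OffsetOnlyDictionary(int initialEntries, int initialBytes) {
    offsets = new int[Math.max(1, initialEntries)];
    bytes = new byte[Math.max(1, initialBytes)];
  }

  /** Appends a value and returns its index (no de-duplication in this sketch). */
  int add(String value) {
    byte[] encoded = value.getBytes(StandardCharsets.UTF_8);
    if (size == offsets.length) {
      // Assumed growth policy: double the offsets array when it is full.
      offsets = Arrays.copyOf(offsets, offsets.length * 2);
    }
    while (bytesUsed + encoded.length > bytes.length) {
      bytes = Arrays.copyOf(bytes, bytes.length * 2);
    }
    offsets[size] = bytesUsed;
    System.arraycopy(encoded, 0, bytes, bytesUsed, encoded.length);
    bytesUsed += encoded.length;
    return size++;
  }

  /** The length is the gap to the next entry's offset, or to the end of the data. */
  int length(int index) {
    if (index < 0 || index >= size) {
      throw new IndexOutOfBoundsException("index " + index);
    }
    int end = (index + 1 < size) ? offsets[index + 1] : bytesUsed;
    return end - offsets[index];
  }

  String get(int index) {
    return new String(bytes, offsets[index], length(index), StandardCharsets.UTF_8);
  }

  /** Usage-based estimate: counts only the array slots that hold data. */
  long usedBytes() {
    return bytesUsed + 4L * size;
  }

  /** Allocation-based estimate: counts the full backing arrays (headers ignored). */
  long allocatedBytes() {
    return bytes.length + 4L * offsets.length;
  }
}
```

Because the arrays grow ahead of demand, allocatedBytes() is never smaller than
usedBytes() in this sketch. A memory check based on usage alone would therefore report
less than the heap the dictionary actually holds.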
> Improve memory usage by ORC dictionaries
> ----------------------------------------
>                 Key: HIVE-4421
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>             Fix For: 0.11.0
>         Attachments: HIVE-4421.D10545.1.patch
> Currently, for tables with many string columns, it is possible to significantly underestimate
> the memory used by the ORC dictionaries and cause the query to run out of memory in the task.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.