hive-dev mailing list archives

From "Siying Dong (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HIVE-1758) optimize group by hash map memory
Date Wed, 10 Nov 2010 04:56:17 GMT

     [ https://issues.apache.org/jira/browse/HIVE-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siying Dong updated HIVE-1758:
------------------------------

    Status: Patch Available  (was: Open)

> optimize group by hash map memory
> ---------------------------------
>
>                 Key: HIVE-1758
>                 URL: https://issues.apache.org/jira/browse/HIVE-1758
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Siying Dong
>         Attachments: HIVE-1758.1.patch
>
>
> The hash map used by map-side Group By consumes a lot of memory, which reduces its effectiveness.
> We can use some of the optimizations from map-join to reduce the memory footprint:
>   class KeyWrapper {
>     int hashcode;
>     ArrayList<Object> keys;
>     // decide whether this is already in hashmap (keys in hashmap are deepcopied
>     // version, and we need to use 'currentKeyObjectInspector').
>     boolean copy = false;
>     // ...
>   }
> 1. Change keys to an array.
> 2. Optimize the case where the number of keys is small (1 or 2, for example).
> Let us start by profiling it and take it from there.
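
For illustration only, the two proposals in the description might be sketched as below. This is not the code in HIVE-1758.1.patch: the names GroupKey, SingleKey and ArrayKey are hypothetical, and the deep-copy / 'currentKeyObjectInspector' handling from the original class is left out. The sketch assumes the key objects have usable equals() and hashCode() methods.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch of a group-by key; not the class from the actual patch.
abstract class GroupKey {
    // Picks a compact representation based on the number of grouping keys.
    static GroupKey of(Object... keys) {
        if (keys.length == 1) {
            return new SingleKey(keys[0]);
        }
        // Copy so that later changes to the caller's array cannot alter the key.
        return new ArrayKey(keys.clone());
    }
}

// Proposal 2: a single grouping key needs no array or list at all.
final class SingleKey extends GroupKey {
    private final Object key;

    SingleKey(Object key) {
        this.key = key;
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(key);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof SingleKey && Objects.equals(key, ((SingleKey) o).key);
    }
}

// Proposal 1: a plain Object[] instead of ArrayList<Object>, which drops the
// extra list object that would otherwise wrap the backing array.
final class ArrayKey extends GroupKey {
    private final Object[] keys;
    private final int hashcode; // computed once, like the 'hashcode' field above

    ArrayKey(Object[] keys) {
        this.keys = keys;
        this.hashcode = Arrays.hashCode(keys);
    }

    @Override
    public int hashCode() {
        return hashcode;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ArrayKey)) {
            return false;
        }
        ArrayKey other = (ArrayKey) o;
        // Compare the cached hash first; it is a cheap way to reject most mismatches.
        return hashcode == other.hashcode && Arrays.equals(keys, other.keys);
    }
}
```

Such keys can be used directly in a HashMap, for example `counts.merge(GroupKey.of("a", 1), 1, Integer::sum)`. Whether either variant actually saves memory for real workloads is exactly what the profiling mentioned above would have to show.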

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

