hive-dev mailing list archives

From "Liyin Tang (JIRA)" <>
Subject [jira] Updated: (HIVE-1754) Remove JDBM component from Map Join
Date Tue, 02 Nov 2010 02:29:26 GMT


Liyin Tang updated HIVE-1754:

    Status: Patch Available  (was: Open)

This patch makes the following changes:
1) Remove JDBM from Hive.
2) Store all the data of the small table in an in-memory hashtable.
3) Create a lightweight RowContainer: MapJoinRowContainer.
4) Optimize MapJoinObjectKey. If there are only one or two join keys, it will use
MapJoinSingleKey or MapJoinDoubleKeys instead of MapJoinObjectKey.
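
The key specialization in item 4 can be illustrated with a minimal sketch. This is not the code in the patch: the class names SingleKey, DoubleKey, GenericKey and the factory method of() are hypothetical and chosen only for illustration. The idea shown is that keys with one or two columns avoid the per-key array that a general multi-column key needs.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Illustrative only: hypothetical key classes, not Hive's actual implementation.
public class JoinKeys {
    /** Key for exactly one join column: no array allocation. */
    static final class SingleKey {
        final Object k;
        SingleKey(Object k) { this.k = k; }
        @Override public int hashCode() { return Objects.hashCode(k); }
        @Override public boolean equals(Object o) {
            return o instanceof SingleKey && Objects.equals(k, ((SingleKey) o).k);
        }
    }

    /** Key for exactly two join columns: two fields instead of an array. */
    static final class DoubleKey {
        final Object k1, k2;
        DoubleKey(Object k1, Object k2) { this.k1 = k1; this.k2 = k2; }
        @Override public int hashCode() {
            return 31 * Objects.hashCode(k1) + Objects.hashCode(k2);
        }
        @Override public boolean equals(Object o) {
            if (!(o instanceof DoubleKey)) return false;
            DoubleKey d = (DoubleKey) o;
            return Objects.equals(k1, d.k1) && Objects.equals(k2, d.k2);
        }
    }

    /** General fallback for any other number of join columns. */
    static final class GenericKey {
        final Object[] ks;
        GenericKey(Object[] ks) { this.ks = ks; }
        @Override public int hashCode() { return Arrays.hashCode(ks); }
        @Override public boolean equals(Object o) {
            return o instanceof GenericKey && Arrays.equals(ks, ((GenericKey) o).ks);
        }
    }

    /** Picks the cheapest key representation for the given number of columns. */
    static Object of(Object... keys) {
        switch (keys.length) {
            case 1: return new SingleKey(keys[0]);
            case 2: return new DoubleKey(keys[0], keys[1]);
            default: return new GenericKey(keys.clone());
        }
    }

    public static void main(String[] args) {
        Map<Object, String> table = new HashMap<>();
        table.put(of(42), "row-a");
        table.put(of("us", 2010), "row-b");
        System.out.println(table.get(of(42)));
        System.out.println(table.get(of("us", 2010)));
    }
}
```

Keys with different numbers of columns never compare equal here, because each class checks the type of the other object first.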

> Remove JDBM component from Map Join
> -----------------------------------
>                 Key: HIVE-1754
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>    Affects Versions: 0.6.0, 0.7.0
>            Reporter: Liyin Tang
>            Assignee: Liyin Tang
>             Fix For: 0.7.0
>         Attachments: Hive-1754.patch
> Right now, JDBM is the major performance bottleneck of map join.
> With the growth of the small table, the PUT and GET operations will take most of the execution time.
> Map Join is designed to load the data of the small table into memory.
> If the data is too large to hold in memory, then there is no need to use the map join.
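
The map join idea described in the issue (load the small table into memory, then probe it while scanning the large table) can be sketched as follows. This is a simplified illustration under stated assumptions, not Hive's operator code: the class MapJoinSketch, the method join(), and the row layout (a String[] with the join key in column 0 and one value in column 1) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a hash join with the small table held entirely in memory.
public class MapJoinSketch {
    /**
     * Joins two tables on column 0. Each row is {key, value}.
     * Returns one "key,bigValue,smallValue" string per matching pair.
     */
    static List<String> join(List<String[]> small, List<String[]> big) {
        // Build phase: every row of the small table goes into an in-memory hashtable.
        Map<String, List<String>> table = new HashMap<>();
        for (String[] row : small) {
            table.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }

        // Probe phase: stream the large table and look up each key (the GET path).
        List<String> out = new ArrayList<>();
        for (String[] row : big) {
            List<String> matches = table.get(row[0]);
            if (matches == null) {
                continue; // inner join: rows without a match are dropped
            }
            for (String m : matches) {
                out.add(row[0] + "," + row[1] + "," + m);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> small = Arrays.asList(
            new String[] {"1", "US"},
            new String[] {"2", "DE"});
        List<String[]> big = Arrays.asList(
            new String[] {"1", "click"},
            new String[] {"3", "view"},
            new String[] {"2", "buy"});
        for (String line : join(small, big)) {
            System.out.println(line);
        }
    }
}
```

Because the build side lives in an ordinary in-memory map, lookups involve no disk-backed store; the trade-off is that the small table must fit in memory.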

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
