hive-dev mailing list archives

From "Liyin Tang (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HIVE-1754) Remove JDBM component from Map Join
Date Tue, 02 Nov 2010 02:29:26 GMT

     [ https://issues.apache.org/jira/browse/HIVE-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liyin Tang updated HIVE-1754:
-----------------------------

    Status: Patch Available  (was: Open)

This patch makes the following changes:
1) Removes JDBM from Hive.
2) Stores all data from the small table in an in-memory hash table.
3) Adds a lightweight RowContainer: MapJoinRowContainer.
4) Optimizes MapJoinObjectKey. If there are only one or two join keys, the map join uses
MapJoinSingleKey or MapJoinDoubleKeys, respectively, instead of MapJoinObjectKey.
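
The idea behind items 2 and 4 can be sketched roughly as follows. This is only an illustration and not the code in the attached patch: the class names SingleKey, DoubleKeys, and SmallTable are hypothetical stand-ins, rows are modeled as plain Object[] arrays, and null join keys are treated as ordinary values for simplicity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class MapJoinKeySketch {

    /** Hypothetical key wrapper for a join on exactly one column. */
    static final class SingleKey {
        private final Object key;

        SingleKey(Object key) { this.key = key; }

        @Override public int hashCode() { return Objects.hashCode(key); }

        @Override public boolean equals(Object o) {
            return o instanceof SingleKey && Objects.equals(key, ((SingleKey) o).key);
        }
    }

    /** Hypothetical key wrapper for a join on exactly two columns. */
    static final class DoubleKeys {
        private final Object first;
        private final Object second;

        DoubleKeys(Object first, Object second) {
            this.first = first;
            this.second = second;
        }

        @Override public int hashCode() {
            // Combine both fields directly instead of hashing a generic array or list.
            return 31 * Objects.hashCode(first) + Objects.hashCode(second);
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof DoubleKeys)) {
                return false;
            }
            DoubleKeys other = (DoubleKeys) o;
            return Objects.equals(first, other.first) && Objects.equals(second, other.second);
        }
    }

    /** Hypothetical in-memory hash table: join key -> rows of the small table. */
    static final class SmallTable<K> {
        private final Map<K, List<Object[]>> rows = new HashMap<>();

        void put(K key, Object[] row) {
            rows.computeIfAbsent(key, k -> new ArrayList<>()).add(row);
        }

        List<Object[]> get(K key) {
            return rows.getOrDefault(key, Collections.emptyList());
        }
    }

    public static void main(String[] args) {
        // Build phase: load the small table into memory, keyed by the join column.
        SmallTable<SingleKey> small = new SmallTable<>();
        small.put(new SingleKey(1), new Object[] {"alice"});
        small.put(new SingleKey(1), new Object[] {"bob"});
        small.put(new SingleKey(2), new Object[] {"carol"});

        // Probe phase: each row of the big table looks up its matches by key.
        System.out.println(small.get(new SingleKey(1)).size());  // 2 matching rows
        System.out.println(small.get(new SingleKey(3)).size());  // 0 matching rows
    }
}
```

The specialized key classes avoid allocating and hashing a general-purpose key container when the number of join keys is known to be one or two, which is presumably the motivation for item 4.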

> Remove JDBM component from Map Join
> -----------------------------------
>
>                 Key: HIVE-1754
>                 URL: https://issues.apache.org/jira/browse/HIVE-1754
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>    Affects Versions: 0.6.0, 0.7.0
>            Reporter: Liyin Tang
>            Assignee: Liyin Tang
>             Fix For: 0.7.0
>
>         Attachments: Hive-1754.patch
>
>
> Right now, JDBM is the major performance bottleneck in map join.
> As the small table grows, the PUT and GET operations take up most of the execution time.
> Map join is designed to load the small table's data into memory.
> If the data is too large to hold in memory, there is no need to use the map join strategy.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

