hive-dev mailing list archives

From "Ning Zhang (JIRA)" <>
Subject [jira] Commented: (HIVE-963) no out of memory errors for skewed join
Date Fri, 04 Dec 2009 01:07:20 GMT


Ning Zhang commented on HIVE-963:

@ragu, I think what you proposed is similar to what Zheng proposed: implementing a lightweight
persistent hash table rather than using JDBM. I think that can be done in a second stage
(the next JIRA?). We can then replace the JDBM in the wrapper with the lightweight persistent hash
table.

Basically, we need the hash table when joining multiple tables, where the key
is the table tag. Of course, the hash map can be avoided by using a list of arrays if we know the
table tags are always small integers.
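
To illustrate the list-of-arrays idea, here is a rough sketch, not the actual Hive code. It assumes
table tags are small non-negative integers in the range 0..numTables-1, so the tag can serve
directly as an index. The class name TagIndexedRows and its methods are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: rows for one join key, bucketed by table tag.
// Assumes tags are small non-negative integers (0 .. numTables-1).
class TagIndexedRows {
    // One row list per table tag; the tag itself is the index,
    // so no hash map lookup is needed.
    private final List<List<Object[]>> rowsByTag;

    TagIndexedRows(int numTables) {
        rowsByTag = new ArrayList<>(numTables);
        for (int i = 0; i < numTables; i++) {
            rowsByTag.add(new ArrayList<>());
        }
    }

    // Append a row coming from the table with the given tag.
    void add(int tag, Object[] row) {
        rowsByTag.get(tag).add(row);
    }

    // All rows buffered so far for the given tag.
    List<Object[]> rows(int tag) {
        return rowsByTag.get(tag);
    }

    public static void main(String[] args) {
        TagIndexedRows buf = new TagIndexedRows(3);
        buf.add(0, new Object[] {"k1", "a"});
        buf.add(2, new Object[] {"k1", "c"});
        buf.add(2, new Object[] {"k1", "d"});
        // Row counts for tags 0, 1 and 2.
        System.out.println(buf.rows(0).size() + " " + buf.rows(1).size() + " " + buf.rows(2).size());
    }
}
```

If tags could be sparse or large, this indexing would waste space or fail, which is when a hash map
keyed by tag would still be needed.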

> no out of memory errors for skewed join
> ---------------------------------------
>                 Key: HIVE-963
>                 URL:
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Ning Zhang
> Currently, in case of skew, hive runs out of memory.
> A simpler fix would be to use JDBM to store data and use that.
> It can be configurable, and JDBM should only be triggered if the number of values for
> a given key exceeds a given number.
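
The threshold behaviour described in the issue could look roughly like the sketch below. This is
only an assumption about the shape of such a wrapper: SpillStore is a hypothetical interface
standing in for JDBM (or a lighter persistent hash table) and is not the JDBM API, and
ListSpillStore is an in-memory fake used only so the example runs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a disk-backed store; this is NOT the JDBM API.
interface SpillStore {
    void append(String value);
    long size();
}

// In-memory fake used only to make the sketch runnable.
class ListSpillStore implements SpillStore {
    private final List<String> values = new ArrayList<>();
    public void append(String value) { values.add(value); }
    public long size() { return values.size(); }
}

// Buffers the values of one join key in memory and moves them to the
// spill store once their count exceeds a configurable threshold.
class ThresholdBuffer {
    private final int threshold;
    private final SpillStore store;
    private final List<String> inMemory = new ArrayList<>();
    private boolean spilled = false;

    ThresholdBuffer(int threshold, SpillStore store) {
        this.threshold = threshold;
        this.store = store;
    }

    void add(String value) {
        if (spilled) {
            store.append(value);
            return;
        }
        inMemory.add(value);
        if (inMemory.size() > threshold) {
            // Threshold exceeded: move everything buffered so far to the store.
            for (String v : inMemory) {
                store.append(v);
            }
            inMemory.clear();
            spilled = true;
        }
    }

    boolean hasSpilled() { return spilled; }

    long size() { return spilled ? store.size() : inMemory.size(); }
}
```

With a threshold of 2, the first two values for a key stay in memory, and the third one triggers
the move to the store. Keys without skew never touch the store at all.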

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
