hive-issues mailing list archives

From "wan kun (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (HIVE-18362) Introduce a parameter to control the max row number for map join convertion
Date Wed, 03 Jan 2018 08:42:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-18362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wan kun reassigned HIVE-18362:
------------------------------


> Introduce a parameter to control the max row number for map join convertion
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-18362
>                 URL: https://issues.apache.org/jira/browse/HIVE-18362
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: wan kun
>            Assignee: wan kun
>            Priority: Minor
>
> The compression ratio of an ORC file can be very high in some cases.
> The test table has three int columns and twelve million records, but the compressed
> file size is only 4 MB. Because the file is so small, Hive automatically converts the
> join to a map join, which then causes a memory overflow. So I think it is better to
> have a parameter that limits the total number of table records allowed in a map join
> conversion: if the total number of records is larger than the limit, the join cannot
> be converted to a map join.
> *hive.auto.convert.join.max.number = 2500000L*
> The default value for this parameter is 2500000, because that many records occupy
> about 700 MB of memory in the client JVM, and a table with 2500000 records is already
> large for a map join.
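
The proposed check and the memory figures quoted above can be sketched roughly as follows. This is only an illustration in Python, not Hive's implementation: the function names are hypothetical, "700M" is assumed to mean 700,000,000 bytes, and memory use is assumed to scale linearly with row count.

```python
# Hypothetical sketch of the proposed row-count guard; not Hive code.

# Proposed default for hive.auto.convert.join.max.number (from the issue text).
DEFAULT_MAX_ROWS = 2_500_000

# Assumption: "about 700M" for 2,500,000 records is read as 700,000,000 bytes.
BYTES_PER_ROW = 700_000_000 // DEFAULT_MAX_ROWS


def can_convert_to_map_join(table_rows, max_rows=DEFAULT_MAX_ROWS):
    """Allow conversion only if the row count does not exceed the limit."""
    return table_rows <= max_rows


def estimated_heap_bytes(table_rows, bytes_per_row=BYTES_PER_ROW):
    """Rough in-memory size, assuming linear scaling with the row count."""
    return table_rows * bytes_per_row


if __name__ == "__main__":
    rows = 12_000_000  # the test table described in the issue
    print(can_convert_to_map_join(rows))
    print(estimated_heap_bytes(rows))
```

Under these assumptions the twelve-million-row test table would need several gigabytes in memory even though its compressed file is only 4 MB, which is why a limit based on file size alone does not catch it.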



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
