hive-issues mailing list archives

From "Ferdinand Xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-17783) Hybrid Grace Hash Join has performance degradation for N-way join using Hive on Tez
Date Fri, 13 Oct 2017 02:25:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-17783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202985#comment-16202985
] 

Ferdinand Xu commented on HIVE-17783:
-------------------------------------

Logically, it should have some performance benefit over the non-hybrid grace hash join, since
it does not need to rescan the big table during the reprocessing phase when the hash table
cannot fit into memory.
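
To make the reasoning above concrete, here is a minimal Python sketch of the general hybrid
grace hash join idea. It is not Hive's implementation: the function name, the partition count,
and the row budget are hypothetical, and in-memory lists stand in for spill files. It assumes
integer join keys and an inner join. The point it illustrates is that big-table rows which hit
a spilled partition are set aside during the single probe scan, so the reprocessing phase never
has to read the big table again.

```python
from collections import defaultdict


def hybrid_grace_hash_join(small, big, num_partitions=4, max_rows_in_memory=2):
    """Inner-join two lists of (key, value) rows.

    Returns (joined_rows, big_rows_scanned). Illustrative only; plain lists
    stand in for the files a real engine would spill to disk.
    """
    tables = [defaultdict(list) for _ in range(num_partitions)]
    spilled_small = {}              # partition id -> spilled small-table rows
    sizes = [0] * num_partitions    # in-memory row count per partition

    # Build phase: hash-partition the small table, spilling when over budget.
    for key, value in small:
        p = hash(key) % num_partitions
        if p in spilled_small:
            spilled_small[p].append((key, value))
            continue
        tables[p][key].append(value)
        sizes[p] += 1
        if sum(sizes) > max_rows_in_memory:
            # Spill the largest in-memory partition.
            victim = max(range(num_partitions), key=lambda q: sizes[q])
            spilled_small[victim] = [
                (k, v) for k, vs in tables[victim].items() for v in vs
            ]
            tables[victim].clear()
            sizes[victim] = 0

    # Probe phase: one scan of the big table.
    joined, spilled_big, scanned = [], defaultdict(list), 0
    for key, value in big:
        scanned += 1
        p = hash(key) % num_partitions
        if p in spilled_small:
            spilled_big[p].append((key, value))  # defer instead of rescanning
        else:
            for small_value in tables[p].get(key, []):
                joined.append((key, small_value, value))

    # Reprocessing phase: join each spilled partition with its deferred rows.
    for p, small_rows in spilled_small.items():
        table = defaultdict(list)
        for key, value in small_rows:
            table[key].append(value)
        for key, value in spilled_big[p]:
            for small_value in table.get(key, []):
                joined.append((key, small_value, value))

    return joined, scanned
```

In this sketch the second return value always equals the number of big-table rows, regardless
of how many partitions spill, which is the benefit described above.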

> Hybrid Grace Hash Join has performance degradation for N-way join using Hive on Tez
> -----------------------------------------------------------------------------------
>
>                 Key: HIVE-17783
>                 URL: https://issues.apache.org/jira/browse/HIVE-17783
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>         Environment: 8*Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
> 1 master + 7 workers
> TPC-DS at 3TB data scales
> Hive version : 2.2.0
>            Reporter: Ferdinand Xu
>         Attachments: Hybrid_Grace_Hash_Join.xlsx, screenshot-1.png
>
>
> Most configurations use their default values. The benchmark compares enabling against
> disabling hybrid grace hash join using TPC-DS queries at the 3TB data scale. Many queries
> involving N-way joins show performance degradation over three test runs. Detailed results are attached.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
