hive-issues mailing list archives

From "Gunther Hagleitner (JIRA)"
Subject [jira] [Commented] (HIVE-10323) Tez merge join operator does not honor hive.join.emit.interval
Date Tue, 21 Apr 2015 19:21:00 GMT


Gunther Hagleitner commented on HIVE-10323:

Patch looks good. Minor nit: The condition for nextKeyGroup should be an else block.

Some other considerations:

- Maybe we should log the emit and spill intervals. Also warn if the former is greater than the latter?
- It looks like you emit before you put the current record into storage. Wouldn't it be better
to emit afterwards?
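
A minimal sketch of the two suggestions above, not code from the patch. The class and method names (EmitIntervalSketch, checkIntervals, process) are hypothetical, and the "spill interval" parameter is an assumed stand-in for whatever setting controls spilling in the operator. It shows a warning check for emit interval > spill interval, and a loop that stores the current record first and only then decides whether to emit.

```java
import java.util.ArrayList;
import java.util.List;

public class EmitIntervalSketch {

    /**
     * Returns a warning message when the emit interval is greater than the
     * spill interval, or null when the configuration looks fine.
     * Both parameters are illustrative; they are not read from Hive config here.
     */
    static String checkIntervals(long emitInterval, long spillInterval) {
        if (emitInterval > spillInterval) {
            return "emit interval (" + emitInterval
                + ") is greater than spill interval (" + spillInterval + ")";
        }
        return null;
    }

    /**
     * Processes rowCount rows, storing each row before checking whether to emit.
     * Returns the size of every emitted batch, including the final partial one.
     */
    static List<Integer> process(int rowCount, int emitInterval) {
        List<Integer> emittedBatchSizes = new ArrayList<>();
        List<Integer> storage = new ArrayList<>();
        for (int row = 0; row < rowCount; row++) {
            storage.add(row); // store the current record first
            if (storage.size() >= emitInterval) { // then decide whether to emit
                emittedBatchSizes.add(storage.size());
                storage.clear();
            }
        }
        if (!storage.isEmpty()) { // flush whatever is left at the end of the key group
            emittedBatchSizes.add(storage.size());
            storage.clear();
        }
        return emittedBatchSizes;
    }

    public static void main(String[] args) {
        String warning = checkIntervals(1000, 100);
        if (warning != null) {
            System.out.println("WARN: " + warning);
        }
        System.out.println(process(5, 2));
    }
}
```

With this ordering the record that triggers the emit is part of the emitted batch, so no batch exceeds the interval.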

Biggest concern: There's not a lot of testing going on. For one thing, I think you could set
the emit interval low (2?) for all Tez tests and see whether you get broader coverage that way.
If not, you should test all the combinations: left, right, outer, multi-key, multi-table, spill
other tables, etc.

> Tez merge join operator does not honor hive.join.emit.interval
> --------------------------------------------------------------
>                 Key: HIVE-10323
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 1.2.0
>            Reporter: Vikram Dixit K
>            Assignee: Vikram Dixit K
>         Attachments: HIVE-10323.1.patch
> This affects efficiency in case of skews.

This message was sent by Atlassian JIRA
