hive-issues mailing list archives

From "Gopal V (JIRA)" <>
Subject [jira] [Comment Edited] (HIVE-14246) Tez: disable auto-reducer parallelism when CUSTOM_EDGE is in place
Date Fri, 15 Jul 2016 07:21:20 GMT


Gopal V edited comment on HIVE-14246 at 7/15/16 7:21 AM:

"inconsistency of perf" unfortunately means some good runs too :)

was (Author: gopalv):
"inconsistency of perf" unfortunately means some god runs too :)

> Tez: disable auto-reducer parallelism when CUSTOM_EDGE is in place
> ------------------------------------------------------------------
>                 Key: HIVE-14246
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Tez
>    Affects Versions: 2.2.0
>            Reporter: Gopal V
>            Assignee: Gopal V
>            Priority: Minor
>             Fix For: 2.2.0
>         Attachments: HIVE-14246.1.patch
> The CUSTOM_SIMPLE_EDGE impl has differences between the size constraints of either edge, which cannot be represented by the ShuffleVertexManager presently.
> Reducing the width based on the hashtable build side vs the streaming probe side has different consequences, since there is no order of runtime between them.
> Until the two parent vertices of the shuffle hash-join are related, this feature causes massive inconsistency of performance across runs.
> For inner & semi joins, the hashtable side should have a higher priority than the streaming side, and for left outer joins, the streaming side can overtake the hashtable side, being the more dominant factor in the final row-counts.
> Until such priorities can be bubbled up into ShuffleVertexManager, this feature can be

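The priority rule in the description above can be sketched roughly as follows. This is only an illustration, not the code in HIVE-14246.1.patch: the function names, the string labels, and the idea of passing edge types as plain strings are all assumptions made for the example, and none of them are Hive or Tez APIs.

```python
# Hypothetical labels for the two parents of a shuffle hash-join.
# These are illustrative strings, not Hive or Tez identifiers.
HASHTABLE = "hashtable build side"
STREAMING = "streaming probe side"

# Edge-type names as they appear in the issue title and description.
CUSTOM_EDGES = ("CUSTOM_EDGE", "CUSTOM_SIMPLE_EDGE")


def dominant_side(join_type: str) -> str:
    """Return which parent should take priority when sizing the reducers."""
    kind = join_type.strip().lower()
    if kind in ("inner", "semi"):
        # The hashtable side should have the higher priority.
        return HASHTABLE
    if kind == "left outer":
        # The streaming side dominates the final row counts.
        return STREAMING
    raise ValueError("join type not covered by the description: " + join_type)


def allow_auto_reducer_parallelism(parent_edge_types) -> bool:
    """Assumed rule: turn the feature off if any parent edge is a custom edge."""
    return not any(edge in CUSTOM_EDGES for edge in parent_edge_types)
```

For example, `allow_auto_reducer_parallelism(["SIMPLE_EDGE", "CUSTOM_SIMPLE_EDGE"])` evaluates to `False` under this assumed rule, while a vertex fed only by `SIMPLE_EDGE` parents keeps the feature enabled.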
This message was sent by Atlassian JIRA
