hive-dev mailing list archives

From "Remus Rusanu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys
Date Tue, 11 Mar 2014 15:19:42 GMT

     [ https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Remus Rusanu updated HIVE-6222:
-------------------------------

    Status: Patch Available  (was: Open)

.3.patch addresses the test failures. An incorrect comparison in checkHashEfficiency was triggering
the switch to streaming mode on the first row processed. While the fix addresses that problem, the
results diff also showed rounding differences between hash mode (agg done using
UnsignedInt128) and streaming mode (agg done on reduce side, using HiveDecimal). This is similar
to the issues HIVE-6511 exposed, and I'll open a separate JIRA to address it.
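
For readers unfamiliar with the mechanism, here is a minimal sketch of the kind of check discussed above. It is not the code from the patch: the class name, the field names, and the two thresholds (checkInterval, maxKeyRatio) are all hypothetical, and the real operator works on vectorized batches rather than single rows. The sketch only illustrates why the direction of the comparison matters.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a map-side group-by that abandons hashing; not Hive's VGBY. */
public class GroupBySketch {
    private final Map<String, Long> sums = new HashMap<>();
    private final long checkInterval;   // assumed: rows between efficiency checks
    private final double maxKeyRatio;   // assumed: max tolerated distinct-keys / rows-seen
    private long rowsSeen = 0;
    private boolean streaming = false;

    public GroupBySketch(long checkInterval, double maxKeyRatio) {
        this.checkInterval = checkInterval;
        this.maxKeyRatio = maxKeyRatio;
    }

    /**
     * Returns true if the row was aggregated into the hash table, false if the
     * caller must forward it unaggregated (streaming / pass-through mode).
     */
    public boolean process(String key, long value) {
        if (streaming) {
            return false;
        }
        rowsSeen++;
        sums.merge(key, value, Long::sum);
        if (rowsSeen % checkInterval == 0) {
            checkHashEfficiency();
        }
        return true;
    }

    private void checkHashEfficiency() {
        // Switch only when there are too many distinct keys for the rows seen.
        // Writing this comparison the other way round would abandon hashing
        // exactly when the hash table is doing its job.
        if (sums.size() > rowsSeen * maxKeyRatio) {
            streaming = true;
            // A real operator would flush the partial aggregates here; omitted.
        }
    }

    public boolean isStreaming() {
        return streaming;
    }

    public Map<String, Long> partialSums() {
        return sums;
    }
}
```

With checkInterval = 4 and maxKeyRatio = 0.5, four rows sharing one key keep the sketch in hash mode, while four rows with four distinct keys flip it to streaming at the first check.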

> Make Vector Group By operator abandon grouping if too many distinct keys
> ------------------------------------------------------------------------
>
>                 Key: HIVE-6222
>                 URL: https://issues.apache.org/jira/browse/HIVE-6222
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Query Processor
>    Affects Versions: 0.13.0
>            Reporter: Remus Rusanu
>            Assignee: Remus Rusanu
>            Priority: Minor
>              Labels: vectorization
>         Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch
>
>
> Row-mode GBY becomes a pass-through if not enough aggregation occurs on the map side,
> relying on the shuffle+reduce side to do the work. Have VGBY do the same.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
