hive-dev mailing list archives

From "Remus Rusanu (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-6614) Vectorized aggregates computed on map side differ (hash mode) from values computed on reduce side (streaming mode)
Date Tue, 11 Mar 2014 15:25:50 GMT
Remus Rusanu created HIVE-6614:
----------------------------------

             Summary: Vectorized aggregates computed on map side differ (hash mode) from values
computed on reduce side (streaming mode)
                 Key: HIVE-6614
                 URL: https://issues.apache.org/jira/browse/HIVE-6614
             Project: Hive
          Issue Type: Bug
            Reporter: Remus Rusanu
            Assignee: Remus Rusanu
            Priority: Critical


HIVE-6222 allows vectorized aggregates to operate in streaming mode, i.e. to flush after each
key change and let the shuffle+reduce side compute the final aggregate values. An error
in patch .2 for HIVE-6222 shows that when queries run in streaming mode, some aggregate
functions (VAR and friends) produce rounding differences. These occur for non-decimal types,
like ctinyint:

{code}
select csmallint, VAR_POP(ctinyint) from alltypesorc where csmallint = -75 group by csmallint;
{code}

This produces 107.55555555555556 vs. 107.55555555555554.
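
A difference in the last digit is what one would expect when the same floating-point
aggregate is built along two different paths. The sketch below is only an illustration, not
Hive's code: it keeps a (count, sum, m2) partial aggregate, folds values in one at a time
("hash mode"), and separately flushes fixed-size partials that are merged afterwards
("streaming mode"). The update and merge formulas are the standard textbook ones, the
function names are made up, the sample values are hypothetical (not taken from alltypesorc),
and the fixed flush size stands in for the flush on key change.

```python
import statistics

EMPTY = (0, 0.0, 0.0)  # (count, sum, m2); m2 is the sum of squared deviations


def update(state, x):
    """Fold one value into a partial aggregate (one-pass update)."""
    n, s, m2 = state
    n += 1
    s += x
    if n > 1:
        t = n * x - s
        m2 += (t * t) / (n * (n - 1))
    return (n, s, m2)


def merge(a, b):
    """Combine two partial aggregates, as a reduce side would."""
    if a[0] == 0:
        return b
    if b[0] == 0:
        return a
    m, ta, sa = a
    n, tb, sb = b
    t = (n / m) * ta - tb
    return (m + n, ta + tb, sa + sb + (m / (n * (m + n))) * t * t)


def aggregate(values):
    """'Hash mode' stand-in: a single aggregate sees every value."""
    state = EMPTY
    for x in values:
        state = update(state, x)
    return state


def aggregate_streaming(values, flush_every):
    """'Streaming mode' stand-in: flush partials, merge them later.

    Assumption: a fixed flush size replaces the real flush-on-key-change.
    """
    total = EMPTY
    for i in range(0, len(values), flush_every):
        total = merge(total, aggregate(values[i:i + flush_every]))
    return total


def var_pop(state):
    n, _, m2 = state
    return m2 / n


# Hypothetical tinyint-range sample.
VALUES = [-5.0, 12.0, 7.0, -30.0, 44.0, 19.0, -8.0, 3.0, 27.0]

if __name__ == "__main__":
    print(repr(var_pop(aggregate(VALUES))))
    print(repr(var_pop(aggregate_streaming(VALUES, 2))))
    print(repr(statistics.pvariance(VALUES)))
```

Both paths are mathematically equivalent, but the additions and divisions happen in a
different order, so the two results need not be bit-identical. Comparing such values with a
small tolerance instead of exact equality is one way to keep test output stable across modes.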






--
This message was sent by Atlassian JIRA
(v6.2#6252)
