hive-dev mailing list archives

From "John Sichi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-1372) New algorithm for variance() UDAF
Date Wed, 02 Jun 2010 00:55:42 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874352#action_12874352 ]

John Sichi commented on HIVE-1372:
----------------------------------

Mayank, I'm getting test failures caused by trailing-digit diffs when running groupby3.q
and groupby3_noskew.q.  If the differences are acceptable, you can just submit a new patch
that includes an updated .q.out for both tests.

    [junit] diff -a -I file: -I /tmp/ -I invalidscheme: -I lastUpdateTime -I lastAccessTime -I owner -I transient_lastDdlTime -I java.lang.RuntimeException -I at org -I at sun -I at java -I at junit -I Caused by: -I [.][.][.] [0-9]* more /data/users/jsichi/open/commit-trunk/build/ql/test/logs/clientpositive/groupby3.q.out /data/users/jsichi/open/commit-trunk/ql/src/test/results/clientpositive/groupby3.q.out
    [junit] 240c240
    [junit] < 130091.0	260.182	256.10355987055016	98.0	0.0	142.92680950752379	143.06995106518903	20428.07287599999	20469.010897795582
    [junit] ---
    [junit] > 130091.0	260.182	256.10355987055016	98.0	0.0	142.92680950752379	143.06995106518903	20428.072875999995	20469.010897795586
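
Last-digit differences like the ones in this diff are typical when the same double-precision quantity is computed with the arithmetic in a different order, because floating-point addition is not associative. Whether that is the cause here is an assumption; the log itself does not say. A minimal Java sketch of the effect, using a hypothetical class name and arbitrary sample values:

```java
// Illustration only: FloatOrderDemo is a hypothetical name, and the values
// below are arbitrary; they are not taken from the groupby3 test data.
class FloatOrderDemo {
    // Sum three doubles, grouping the first two terms together.
    static double leftGrouped() {
        return (0.1 + 0.2) + 0.3;
    }

    // Sum the same three doubles, grouping the last two terms together.
    static double rightGrouped() {
        return 0.1 + (0.2 + 0.3);
    }

    public static void main(String[] args) {
        double a = leftGrouped();
        double b = rightGrouped();
        // The two results differ only in the trailing digits.
        System.out.println(a);
        System.out.println(b);
        System.out.println("equal: " + (a == b));
    }
}
```

Both results are equally legitimate roundings of the same mathematical sum, which is why refreshing the expected .q.out can be a reasonable response once the new digits have been checked.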


> New algorithm for variance() UDAF
> ---------------------------------
>
>                 Key: HIVE-1372
>                 URL: https://issues.apache.org/jira/browse/HIVE-1372
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Query Processor
>    Affects Versions: 0.6.0
>            Reporter: Mayank Lahiri
>            Assignee: Mayank Lahiri
>            Priority: Minor
>             Fix For: 0.6.0
>
>         Attachments: HIVE-1372.2.patch, HIVE-1372.patch
>
>
> A new algorithm for the UDAF that computes variance. It is pretty much a drop-in replacement
> for the current UDAF and has two benefits: it is provably numerically stable (reference included
> in comments), and it reduces arithmetic operations by about half.
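
This message does not show the formula used in the patch, so the following is only a sketch of one well-known numerically stable approach: Welford's streaming update, with the pairwise merge attributed to Chan et al. for combining partial aggregates. It is not claimed to be the algorithm in HIVE-1372. The class and method names are hypothetical and do not correspond to Hive's UDAF interfaces.

```java
// Sketch of a numerically stable streaming variance (Welford update plus
// pairwise merge). Hypothetical class; not the HIVE-1372 implementation.
class StableVariance {
    private long count;   // number of values seen so far
    private double mean;  // running mean of the values
    private double m2;    // running sum of squared deviations from the mean

    // Fold one value into the running state.
    void add(double x) {
        count++;
        double delta = x - mean;
        mean += delta / count;
        m2 += delta * (x - mean);
    }

    // Combine another partial aggregate into this one, as a distributed
    // aggregation would when merging per-task partial results.
    void merge(StableVariance other) {
        if (other.count == 0) {
            return;
        }
        if (count == 0) {
            count = other.count;
            mean = other.mean;
            m2 = other.m2;
            return;
        }
        long total = count + other.count;
        double delta = other.mean - mean;
        m2 += other.m2 + delta * delta * ((double) count * other.count / total);
        mean += delta * other.count / total;
        count = total;
    }

    // Population variance; NaN when no values have been added.
    double populationVariance() {
        return count > 0 ? m2 / count : Double.NaN;
    }

    // Sample variance; NaN when fewer than two values have been added.
    double sampleVariance() {
        return count > 1 ? m2 / (count - 1) : Double.NaN;
    }

    public static void main(String[] args) {
        StableVariance v = new StableVariance();
        for (double x : new double[] {2, 4, 4, 4, 5, 5, 7, 9}) {
            v.add(x);
        }
        System.out.println(v.populationVariance());
        System.out.println(v.sampleVariance());
    }
}
```

The point of this style of update is that it never subtracts two large, nearly equal sums (as the naive sum-of-squares formula does), which is where most of the precision loss in a textbook variance computation comes from.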

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

