hadoop-hive-dev mailing list archives

From "Mayank Lahiri (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-1549) Add ANSI SQL correlation aggregate function CORR(X,Y).
Date Wed, 18 Aug 2010 20:57:16 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900029#action_12900029 ]

Mayank Lahiri commented on HIVE-1549:
-------------------------------------

Nice job, Pierre! Just a couple of very minor points:

-- UDAF file, lines #116 and #123: could you amend the error message to indicate that
only numeric types are accepted (string is also included as of now)?

-- I don't think you need the private boolean warned (line #273).

Otherwise, it looks good and the numbers work out.
 

Incidentally, for the future: if your UDAF only stores a small number of values as its partial
aggregation, you might consider serializing those values as a list of doubles instead of a
struct in terminatePartial() and merge(). That will probably save you some time and reduce
the amount of code in those methods.
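
To illustrate the list-of-doubles idea, here is a minimal sketch in plain Java. It does not use Hive's actual UDAF interfaces: the class name CorrSketch and its method signatures are hypothetical stand-ins that only mirror the iterate/terminatePartial/merge/terminate lifecycle. The choice of six running sums as the partial state is also an assumption made for brevity, and is not necessarily what the patch computes; a production version would likely prefer a more numerically stable update.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a UDAF aggregation buffer; not Hive's API. */
class CorrSketch {
    // Partial state (assumed layout): n, sum(x), sum(y), sum(x*y), sum(x*x), sum(y*y)
    private final double[] s = new double[6];

    /** Fold one (x, y) pair into the running sums. */
    void iterate(double x, double y) {
        s[0] += 1;
        s[1] += x;
        s[2] += y;
        s[3] += x * y;
        s[4] += x * x;
        s[5] += y * y;
    }

    /** Serialize the partial state as a flat list of doubles (no struct needed). */
    List<Double> terminatePartial() {
        List<Double> out = new ArrayList<>(s.length);
        for (double v : s) {
            out.add(v);
        }
        return out;
    }

    /** Merging two partials is an element-wise add, because every field is a sum. */
    void merge(List<Double> partial) {
        if (partial == null) {
            return;
        }
        for (int i = 0; i < s.length; i++) {
            s[i] += partial.get(i);
        }
    }

    /** Pearson correlation from the sums; null when it is undefined. */
    Double terminate() {
        double n = s[0];
        if (n < 2) {
            return null;
        }
        double cov = n * s[3] - s[1] * s[2];
        double varX = n * s[4] - s[1] * s[1];
        double varY = n * s[5] - s[2] * s[2];
        if (varX <= 0 || varY <= 0) {
            return null;
        }
        return cov / Math.sqrt(varX * varY);
    }
}
```

With this layout, both terminatePartial() and merge() reduce to a short loop over the list, which is where the code savings would come from compared with building and unpacking a struct field by field.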

> Add ANSI SQL correlation aggregate function CORR(X,Y).
> ------------------------------------------------------
>
>                 Key: HIVE-1549
>                 URL: https://issues.apache.org/jira/browse/HIVE-1549
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.7.0
>            Reporter: Pierre Huyn
>            Assignee: Pierre Huyn
>             Fix For: 0.7.0
>
>         Attachments: HIVE-1549.1.patch
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> Aggregate function that computes Pearson's coefficient of correlation between a set
> of number pairs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

