hadoop-pig-dev mailing list archives

From "Pi Song (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-277) UDF for computing correlation and covariance between data sets
Date Tue, 24 Jun 2008 12:34:45 GMT

    [ https://issues.apache.org/jira/browse/PIG-277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12607602#action_12607602 ]

Pi Song commented on PIG-277:
-----------------------------

Since we're not going to do correlation between too many data sets, I think we could just solve

x*x - x - n = 0   =>   x(x-1) = n

by substituting x = 1, 2, 3, ... until we either get a match or x(x-1) > n.
We could do it as a binary search to get O(log n) if you like, but we won't get much out of it.
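
For illustration only, here is a minimal sketch of both searches in Python (not the language of the patch). The function names `solve_linear` and `solve_binary` are hypothetical, and the sketch assumes n is a non-negative integer and that we want the integer x >= 1 with x(x-1) = n, if one exists. The linear scan stops after roughly sqrt(n) steps, which is why the binary search buys little when n is small.

```python
def solve_linear(n):
    """Try x = 1, 2, 3, ... until x*(x-1) matches n or exceeds it.

    Returns the matching x, or None if no integer x >= 1 satisfies x*(x-1) == n.
    """
    x = 1
    while x * (x - 1) < n:
        x += 1
    return x if x * (x - 1) == n else None


def solve_binary(n):
    """Same result as solve_linear, found by binary search.

    x*(x-1) is non-decreasing for x >= 1, so we can bisect on x.
    """
    if n < 0:
        return None
    # Grow an upper bound by doubling until hi*(hi-1) >= n.
    lo, hi = 1, 2
    while hi * (hi - 1) < n:
        hi *= 2
    # Find the smallest x in [lo, hi] with x*(x-1) >= n.
    while lo < hi:
        mid = (lo + hi) // 2
        if mid * (mid - 1) < n:
            lo = mid + 1
        else:
            hi = mid
    return lo if lo * (lo - 1) == n else None


if __name__ == "__main__":
    for n in (6, 12, 7):
        print(n, solve_linear(n), solve_binary(n))
```

With n = 6 both functions give x = 3, with n = 12 they give x = 4, and with n = 7 they report no match.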

> UDF for computing correlation and covariance between data sets
> --------------------------------------------------------------
>
>                 Key: PIG-277
>                 URL: https://issues.apache.org/jira/browse/PIG-277
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Ajay Garg
>            Priority: Minor
>         Attachments: stat.patch
>
>
> UDFs for computing correlation and covariance between data sets. Use the following commands to compute covariance:
> A = load 'input.xml' using PigStorage(':');
> B = group A all;
> define c COV('a','b','c');
> D = foreach B generate group,c(A.$0,A.$1,A.$2);

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

