[ https://issues.apache.org/jira/browse/HIVE-1545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12918094#action_12918094
]
Jonathan Chang commented on HIVE-1545:

Sorry about that. I've uploaded a new tarball which should contain the lib directory along
with some new UDFs. UDAFPearson has also been removed since it's been obsoleted by CORR.
> Add a bunch of UDFs and UDAFs
> 
>
> Key: HIVE-1545
> URL: https://issues.apache.org/jira/browse/HIVE-1545
> Project: Hadoop Hive
> Issue Type: New Feature
> Components: UDF
> Reporter: Jonathan Chang
> Assignee: Jonathan Chang
> Priority: Minor
> Attachments: udfs.tar.gz, udfs.tar.gz
>
>
> Here are some UD(A)Fs which can be incorporated into the Hive distribution:
> UDFArgMax - Finds the 0-indexed index of the largest argument, e.g., ARGMAX(4, 5, 3) returns 1.
> UDFBucket - Finds the bucket to which the first argument belongs, e.g., BUCKET(x, b_1, b_2, b_3, ...) returns the smallest i such that x > b_{i} but x <= b_{i+1}. Returns 0 if x is smaller than all the buckets.
> UDFFindInArray - Finds the 1-indexed position of the first argument in the array given as the second argument. Returns 0 if not found. Returns NULL if either argument is NULL. E.g., FIND_IN_ARRAY(5, array(1,2,5)) returns 3, and FIND_IN_ARRAY(5, array(1,2,3)) returns 0.
> UDFGreatCircleDist - Finds the great circle distance (in km) between two lat/long coordinates (in degrees).
> UDFLDA - Performs LDA inference on a vector given fixed topics.
> UDFNumberRows - Numbers successive rows starting from 1. The counter resets to 1 whenever any of its parameters changes.
> UDFPmax - Finds the maximum of a set of columns, e.g., PMAX(4, 5, 3) returns 5.
> UDFRegexpExtractAll - Like REGEXP_EXTRACT except that it returns all matches in an array.
> UDFUnescape - Returns the string unescaped (using C/Java-style unescaping).
> UDFWhich - Given a boolean array, returns the indices which are TRUE.
> UDFJaccard
> UDAFCollect - Takes all the values associated with a row and converts them into a list. Make sure to have: set hive.map.aggr = false;
> UDAFCollectMap - Like collect except that it takes tuples and generates a map.
> UDAFEntropy - Computes the entropy of a column.
> UDAFPearson (BROKEN!!!) - Computes the Pearson correlation between two columns.
> UDAFTop - TOP(KEY, VAL) - returns the KEY associated with the largest value of VAL.
> UDAFTopN (BROKEN!!!) - Like TOP except that it returns a list of the keys associated with the N (passed as the third parameter) largest values of VAL.
> UDAFHistogram
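
For anyone checking the index conventions in the list above, here is a rough Python sketch of the semantics described for ARGMAX, PMAX, BUCKET and FIND_IN_ARRAY. It is not the code in the tarball. The function names are illustrative, and three behaviours are assumptions that the description does not settle: ties in ARGMAX go to the first argument, the BUCKET boundaries are sorted in ascending order, and BUCKET returns the number of boundaries when x is larger than all of them.

```python
from bisect import bisect_left


def argmax(*args):
    """0-indexed position of the largest argument.

    Ties resolve to the first occurrence (an assumption).
    """
    return max(range(len(args)), key=lambda i: args[i])


def pmax(*args):
    """Largest of the arguments."""
    return max(args)


def bucket(x, *bounds):
    """Count of boundaries strictly below x, assuming bounds are ascending.

    With 1-indexed boundaries b_1, b_2, ... this is the i for which
    b_i < x <= b_{i+1}, and 0 when x is not above any boundary.
    Returning len(bounds) when x exceeds every boundary is an assumption.
    """
    return bisect_left(bounds, x)


def find_in_array(needle, haystack):
    """1-indexed position of needle in haystack, 0 if absent, None for NULL input."""
    if needle is None or haystack is None:
        return None
    for position, item in enumerate(haystack, start=1):
        if item == needle:
            return position
    return 0


print(argmax(4, 5, 3))                 # ARGMAX(4, 5, 3)
print(pmax(4, 5, 3))                   # PMAX(4, 5, 3)
print(bucket(15, 10, 20, 30))          # BUCKET(15, 10, 20, 30)
print(find_in_array(5, [1, 2, 5]))     # FIND_IN_ARRAY(5, array(1,2,5))
```

Note that ARGMAX is 0-indexed while FIND_IN_ARRAY is 1-indexed, so that 0 can mean "not found" in the latter.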

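The UDFGreatCircleDist entry only says that it takes two lat/long pairs in degrees and returns kilometres. One common way to compute that is the haversine formula, sketched below in Python. Whether the UDF uses haversine, and which Earth radius it uses, is not stated in the issue, so both the formula choice and the 6371 km mean radius are assumptions, as is the function name.

```python
import math

# Mean Earth radius in km. This value is an assumption, not taken from the UDF.
EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)

    # Haversine of the central angle between the two points.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)

    # Clamp to 1.0 so rounding error cannot push asin out of its domain.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# A quarter of the way around the equator.
print(great_circle_km(0.0, 0.0, 0.0, 90.0))
```

Haversine treats the Earth as a sphere, so results can differ slightly from an ellipsoidal calculation.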