Erik Shilts created HIVE-3711:

Summary: Create UDAF to calculate an array of Benford's Law
Key: HIVE-3711
URL: https://issues.apache.org/jira/browse/HIVE-3711
Project: Hive
Issue Type: New Feature
Components: UDF
Reporter: Erik Shilts
Priority: Minor
Benford's Law is a useful analytical tool for assessing whether a set of numbers was generated by a
random process, based on the relative proportions of their leading digits. It can be used to detect
accounting, financial, and election fraud.
[Wikipedia's|http://en.wikipedia.org/wiki/Benford's_law] Benford's Law page has a good overview.
Hive is well suited to calculating Benford's Law distributions. The result should be a named struct
with field names 1-9 whose values are the corresponding proportions of each leading digit.
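To make the requested output concrete, here is a minimal sketch in plain Java of the aggregation logic. The class name BenfordCounter and its method names are hypothetical; they only mirror the iterate/merge/terminate shape of a UDAF evaluator and use no Hive classes. The sketch assumes the input column is numeric and that zero, NaN, and infinite values are skipped.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hive: accumulates leading-digit counts.
final class BenfordCounter {
    // counts[d] is the number of values whose leading digit is d (1..9); index 0 is unused.
    private final long[] counts = new long[10];

    // Returns the first non-zero digit of |value|, or 0 if there is none
    // (zero, NaN, and infinity have no leading digit in this sketch).
    static int leadingDigit(double value) {
        String text = Double.toString(Math.abs(value));
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= '1' && c <= '9') {
                return c - '0';
            }
        }
        return 0;
    }

    // Called once per input row.
    void iterate(double value) {
        int digit = leadingDigit(value);
        if (digit != 0) {
            counts[digit]++;
        }
    }

    // Combines a partial aggregation from another task into this one.
    void merge(BenfordCounter other) {
        for (int d = 1; d <= 9; d++) {
            counts[d] += other.counts[d];
        }
    }

    // Produces the final result: keys "1".."9" mapped to observed proportions.
    // If no usable rows were seen, every proportion is reported as 0.0.
    Map<String, Double> terminate() {
        long total = 0;
        for (int d = 1; d <= 9; d++) {
            total += counts[d];
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (int d = 1; d <= 9; d++) {
            result.put(Integer.toString(d), total == 0 ? 0.0 : (double) counts[d] / total);
        }
        return result;
    }
}
```

Keeping only nine counters as the partial state means merge is a simple element-wise sum, which suits a distributed aggregation. A real implementation would return a Hive struct rather than a Java Map.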
An alternative is to calculate the deviations from Benford's Law for each digit. The structure
of the result would be the same, but each value would be the difference between the actual
proportion and the proportion given by the
[formula|http://en.wikipedia.org/wiki/Benford's_law#Mathematical_statement] on Wikipedia.
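For reference, Benford's Law gives the expected proportion of leading digit d as log10(1 + 1/d), so digit 1 is expected about 30.1% of the time. The deviation variant could be sketched in plain Java as follows. The class name BenfordDeviation and its methods are hypothetical, and the sketch assumes the observed proportions arrive as an array ordered by digit 1 through 9.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hive: compares observed proportions to Benford's Law.
final class BenfordDeviation {
    // Expected proportion of leading digit d (1..9) under Benford's Law.
    static double expected(int d) {
        return Math.log10(1.0 + 1.0 / d);
    }

    // observed[i] is the observed proportion for digit i + 1 (assumed layout).
    // Returns keys "1".."9" mapped to (observed - expected).
    static Map<String, Double> deviations(double[] observed) {
        if (observed.length != 9) {
            throw new IllegalArgumentException("expected 9 proportions, one per digit 1-9");
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (int d = 1; d <= 9; d++) {
            result.put(Integer.toString(d), observed[d - 1] - expected(d));
        }
        return result;
    }
}
```

A positive value means the digit appears more often than Benford's Law predicts, and a negative value means it appears less often.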

