hive-dev mailing list archives

From "Erik Shilts (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-3711) Create UDAF to calculate an array of Benford's Law
Date Thu, 15 Nov 2012 13:00:12 GMT
Erik Shilts created HIVE-3711:
---------------------------------

             Summary: Create UDAF to calculate an array of Benford's Law
                 Key: HIVE-3711
                 URL: https://issues.apache.org/jira/browse/HIVE-3711
             Project: Hive
          Issue Type: New Feature
          Components: UDF
            Reporter: Erik Shilts
            Priority: Minor


Benford's Law is a useful analytical tool for determining whether a set of numbers was generated
by a random process, by evaluating the relative proportions of their leading digits. It can be used
to detect accounting, financial, and election fraud.

[Wikipedia's|http://en.wikipedia.org/wiki/Benford's_law] Benford's Law page has a good overview.

Hive is well suited to calculating Benford's Law distributions. The result should be a named struct
with names 1-9, where each value is the proportion of inputs whose leading digit is that digit.
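
As a rough illustration of the aggregation described above, here is a minimal Python sketch (not a Hive UDAF). The function names leading_digit and leading_digit_proportions are hypothetical, and the choice to skip zeros and non-finite values is an assumption, since the issue does not say how they should be handled.

```python
from collections import Counter


def leading_digit(value):
    """Return the first non-zero digit of a number, or None if there is none."""
    # Scan the decimal text form; the first character in 1-9 is the leading digit.
    # Zero, inf, and nan contain no such character and yield None.
    for ch in str(abs(value)):
        if ch in "123456789":
            return int(ch)
    return None


def leading_digit_proportions(values):
    """Map each digit 1-9 to the share of values that start with that digit."""
    counts = Counter()
    for v in values:
        d = leading_digit(v)
        if d is not None:  # assumption: values without a leading digit are skipped
            counts[d] += 1
    total = sum(counts.values())
    if total == 0:
        return {d: 0.0 for d in range(1, 10)}
    return {d: counts[d] / total for d in range(1, 10)}


if __name__ == "__main__":
    sample = [123, 19, 0.0021, 345, -1.5, 0]
    print(leading_digit_proportions(sample))
```

The returned dict plays the role of the named struct: keys 1-9, values summing to 1 whenever at least one input has a leading digit.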

An alternative is to calculate the deviations from Benford's Law for each digit. The structure
of the result would be the same, but each value would be the difference between the
actual proportion and the proportion given by the [formula|http://en.wikipedia.org/wiki/Benford's_law#Mathematical_statement]
on Wikipedia.
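
The deviation variant could be sketched as follows, again in plain Python and not as a Hive UDAF. It uses the standard statement of Benford's Law, P(d) = log10(1 + 1/d) for d = 1..9. The names benford_expected and benford_deviations are hypothetical, and treating a digit missing from the input as an observed proportion of 0 is an assumption.

```python
import math


def benford_expected():
    """Expected leading-digit proportions under Benford's Law: log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def benford_deviations(observed):
    """Return observed minus expected proportion for each digit 1-9.

    `observed` maps digits 1-9 to proportions; a missing digit counts as 0.0
    (an assumption made for this sketch).
    """
    expected = benford_expected()
    return {d: observed.get(d, 0.0) - expected[d] for d in range(1, 10)}


if __name__ == "__main__":
    # A uniform leading-digit distribution under-represents 1 and over-represents 9.
    uniform = {d: 1 / 9 for d in range(1, 10)}
    for digit, deviation in benford_deviations(uniform).items():
        print(digit, round(deviation, 4))
```

A positive value means the digit appears more often than Benford's Law predicts, a negative value less often.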

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
