hive-dev mailing list archives

From "David Phillips (JIRA)" <>
Subject [jira] Updated: (HIVE-165) Add standard statistical functions
Date Fri, 19 Dec 2008 01:01:44 GMT


David Phillips updated HIVE-165:

    Summary: Add standard statistical functions  (was: var(col) built-in to go with avg(col)
and count(col))

> Add standard statistical functions
> ----------------------------------
>                 Key: HIVE-165
>                 URL:
>             Project: Hadoop Hive
>          Issue Type: Wish
>          Components: Query Processor
>            Reporter: Adam Kramer
>            Assignee: David Phillips
>            Priority: Minor
> The last step in the unholy triumvirate of statistical built-ins is the variance. We
already have the n (count) and the mean (avg). I currently have a job or two that filters
all of the data into a single reducer which just computes mean/n/variance and writes it to
a my guess is that this would be a pretty big speed increase. Not a huge deal though,
as computing the variance myself is trivial.
> (Average, variance, and n can be co-computed in one pass, so if you're doing var() you
can basically have avg() and count() for free.)
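
The one-pass claim above can be sketched as follows. This is only an illustration, not Hive's implementation: the issue does not say which method would be used, so the sketch assumes Welford's running update plus a pairwise merge step (so that partial results from several mappers could be combined in a reducer). The names `RunningStats`, `add`, `merge`, and `summarize` are hypothetical, and the default of population variance rather than sample variance is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Hypothetical accumulator holding n, mean, and the sum of squared deviations."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of (x - mean)^2 over the values seen so far

    def add(self, x: float) -> None:
        # Welford's update: count, mean, and m2 are all refreshed from one value.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other: "RunningStats") -> "RunningStats":
        # Combine two partial results without revisiting the underlying rows.
        if other.n == 0:
            return RunningStats(self.n, self.mean, self.m2)
        if self.n == 0:
            return RunningStats(other.n, other.mean, other.m2)
        n = self.n + other.n
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return RunningStats(n, mean, m2)

    def variance(self, sample: bool = False) -> float:
        # Population variance by default (an assumption); sample variance on request.
        denom = self.n - 1 if sample else self.n
        if denom <= 0:
            return float("nan")
        return self.m2 / denom


def summarize(values) -> RunningStats:
    """Single pass over the values; count, mean, and variance come out together."""
    stats = RunningStats()
    for v in values:
        stats.add(v)
    return stats
```

After `summarize(column)`, the fields `n` and `mean` and the method `variance()` are all available from the same pass, which is the sense in which avg() and count() come "for free" with var(). The `merge` step is included because a built-in aggregate would likely need to combine partial results rather than funnel every row into a single reducer.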

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
