hive-dev mailing list archives

From "Gopal V (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-7402) add `approx_distinct` & composable nDV UDAFs
Date Mon, 14 Jul 2014 21:36:04 GMT
Gopal V created HIVE-7402:
-----------------------------

             Summary: add `approx_distinct` & composable nDV UDAFs
                 Key: HIVE-7402
                 URL: https://issues.apache.org/jira/browse/HIVE-7402
             Project: Hive
          Issue Type: New Feature
            Reporter: Gopal V


Build composable approximate-distinct UDAFs into Hive.

This is useful for approximate queries, particularly for merging partial nDV (number of distinct
values) estimates whenever a partition is added.

{code}
hive> select approx_distinct(ss_item_sk), approx_distinct(ss_quantity)  from tpcds_orc_10000.store_sales;

OK
403760  100
Time taken: 238.258 seconds, Fetched: 1 row(s)
{code}
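
To illustrate what "composable" means here, below is a minimal sketch of a HyperLogLog-style estimator whose partial states can be merged. It is not the prototype's code and it implements plain HyperLogLog rather than HLL++. The class name `HllSketch`, the precision `p=12`, and the SHA-1 based hash are all assumptions made for illustration. The point it shows is that a merge is a register-wise maximum, so per-partition sketches can be combined without rescanning the data.

```python
import hashlib
import math


class HllSketch:
    """Toy HyperLogLog sketch (hypothetical; plain HLL, not HLL++)."""

    def __init__(self, p=12):
        self.p = p                    # precision: number of index bits
        self.m = 1 << p               # number of registers
        self.registers = [0] * self.m

    def add(self, value):
        # Derive a 64-bit hash from the value (SHA-1 is an assumption here).
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                   # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def merge(self, other):
        # Composability: the union of two sketches is the register-wise max.
        if self.p != other.p:
            raise ValueError("cannot merge sketches with different precision")
        merged = HllSketch(self.p)
        merged.registers = [max(a, b)
                            for a, b in zip(self.registers, other.registers)]
        return merged

    def estimate(self):
        # Standard HyperLogLog estimator; alpha is valid for m >= 128.
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return self.m * math.log(self.m / zeros)
        return raw


# Two overlapping "partitions" sketched independently, then merged.
part_a, part_b = HllSketch(), HllSketch()
for i in range(0, 60000):
    part_a.add(i)
for i in range(40000, 100000):
    part_b.add(i)
combined = part_a.merge(part_b)
print(round(combined.estimate()))  # approximate count of distinct values in the union
```

Because the merged registers are identical to those of a sketch built over the union of the inputs, adding a partition would only require sketching the new partition and merging it with the stored state.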

Prototype Hive UDAFs/UDFs are available at https://github.com/t3rmin4t0r/hive-hll-udf/

The prototype uses [~prasanth_j]'s fast HLL++ implementation for the horsepower.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
