phoenix-dev mailing list archives

From "Julian Hyde (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-3390) Custom UDAF for HyperLogLogPlus
Date Sun, 25 Jun 2017 22:29:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16062445#comment-16062445 ]

Julian Hyde commented on PHOENIX-3390:
--------------------------------------

[~gjacoby], thanks for the heads up. Even more pertinent than my data profiling work (which
addresses the problem of computing 2^n approximate distinct counts simultaneously; see CALCITE-1616)
is the widespread need for a fast, approximate distinct count. Druid supports various sketches,
and we want to surface them in Calcite's Druid adapter (see CALCITE-1787 theta-sketch, CALCITE-1587
top-N, and CALCITE-1853 knowing when an approximate count-distinct is acceptable).

Many databases today have a syntax for approximate aggregates. Unfortunately, the syntaxes
are rarely the same, and they are often too closely coupled to a particular algorithm (e.g. HyperLogLog).
I have logged CALCITE-1588 to introduce an {{APPROXIMATE}} clause, e.g. {{COUNT(DISTINCT customerId)
APPROXIMATE (WITHIN 10 PERCENT)}}. It would be great if Phoenix adopted that syntax.

> Custom UDAF for HyperLogLogPlus
> -------------------------------
>
>                 Key: PHOENIX-3390
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3390
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Swapna Kasula
>            Assignee: Ethan Wang
>            Priority: Minor
>
> With ref # PHOENIX-2069
> Custom UDAF that aggregates (unions) the HyperLogLogs of a column and returns a HyperLogLog.
> select hllUnion(col1) from table;  // returns a HyperLogLog: the union of the HyperLogLogs
> from all rows for column 'col1'
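
For readers unfamiliar with what a union of HyperLogLogs involves, here is a minimal sketch in
Python. It is not Phoenix's implementation and it uses plain HyperLogLog rather than
HyperLogLogPlus. The names hll_add, hll_union and hll_estimate, the precision P = 10, and the
SHA-1 based hash are all assumptions made for illustration. The point it shows is that a union
is the register-wise maximum of the input sketches, which is why an aggregate such as hllUnion
can merge per-row sketches in any order.

```python
import hashlib
import math

P = 10            # assumed precision: 2**P registers (illustrative choice)
M = 1 << P        # number of registers


def _hash64(value):
    """Map a value to a 64-bit integer (hash choice is an assumption)."""
    digest = hashlib.sha1(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def hll_new():
    """Return an empty sketch: one small counter per register."""
    return [0] * M


def hll_add(registers, value):
    """Record one value in the sketch."""
    h = _hash64(value)
    idx = h >> (64 - P)                        # top P bits select the register
    rest = h & ((1 << (64 - P)) - 1)           # remaining 64 - P bits
    rank = (64 - P) - rest.bit_length() + 1    # position of the leftmost 1-bit
    registers[idx] = max(registers[idx], rank)


def hll_union(sketches):
    """Union of sketches: take the maximum of each register."""
    merged = hll_new()
    for regs in sketches:
        for i, r in enumerate(regs):
            if r > merged[i]:
                merged[i] = r
    return merged


def hll_estimate(registers):
    """Estimate the number of distinct values recorded in the sketch."""
    alpha = 0.7213 / (1 + 1.079 / M)           # bias constant, valid for M >= 128
    raw = alpha * M * M / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * M and zeros:               # small-range (linear counting) correction
        return M * math.log(M / zeros)
    return raw


# Usage: two overlapping "rows", each holding a sketch of its own values.
row_a, row_b = hll_new(), hll_new()
for v in range(0, 1000):
    hll_add(row_a, v)
for v in range(500, 1500):
    hll_add(row_b, v)
union = hll_union([row_a, row_b])
print(round(hll_estimate(union)))              # approximate count of 1500 distinct values
```

Because the merge is a per-register maximum, the union of the two sketches is identical to a
sketch built directly from all of the values, and overlapping values are not double-counted.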



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
