drill-dev mailing list archives

From John Omernik <j...@omernik.com>
Subject Re: median, quantile
Date Tue, 07 Jun 2016 13:50:39 GMT
Julian, great point.

With a proper design, users could set session variables or use a SELECT with
options, so that a single query wouldn't change the session-wide settings.
That seems promising as an idea.

John

On Mon, Jun 6, 2016 at 8:12 PM, Julian Hyde <jhyde@apache.org> wrote:

> I’ve thought for some time that SQL aggregate functions should have an
> “APPROXIMATE ( … )” clause. Users don’t WANT to call a TD_MEDIAN function,
> they want the MEDIAN that gives them an answer to their desired accuracy
> (within X, within Y%, or within a given confidence interval), and TD_MEDIAN
> may be the way to achieve that.
>
> In fact the user might just set “SET APPROXIMATE = '95%'” in their session
> and the APPROXIMATE clause is implicit on every query they write.
>
> Approximate aggregate functions are all the rage right now but I’m not
> aware of any effort to standardize them across databases.
>
> Julian
>
>
> > On Jun 6, 2016, at 5:58 PM, Parth Chandra <parthc@apache.org> wrote:
> >
> > Hey Steven,
> > Somehow I missed this one when you posted it.
> > Since you asked, I would suggest names other than median and quantile,
> > since those might mislead. How about td_median, td_quantile?
> >
> > On Wed, Apr 13, 2016 at 11:51 AM, Steven Phillips <steven@dremio.com> wrote:
> >
> >> I submitted a pull request a little while ago that introduces (approximate)
> >> median and quantile functions using the tdigest library.
> >>
> >> https://github.com/apache/drill/pull/456
> >>
> >> It would be great if I could get some feedback on this. Specifically, is it
> >> ok to call these functions median and quantile, given that they are not
> >> exact?
> >>
>
>
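
To make the naming question in this thread concrete, here is a minimal Python
sketch of the gap between an exact median and an approximate one. It is NOT
the t-digest algorithm used in the pull request; it swaps in a plain reservoir
sample because that fits in a few lines. The names exact_quantile and
approx_quantile, and the sample_size and seed parameters, are hypothetical and
exist only for this illustration.

```python
import random


def exact_quantile(values, q):
    """Exact q-quantile: sort everything and pick the element at rank q*n."""
    ordered = sorted(values)
    idx = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[idx]


def approx_quantile(values, q, sample_size=1000, seed=0):
    """Approximate q-quantile from a fixed-size reservoir sample.

    Assumption: this stands in for a real sketch such as t-digest. It keeps
    at most sample_size values in memory (Algorithm R), then takes the exact
    quantile of that sample.
    """
    rng = random.Random(seed)
    reservoir = []
    for i, value in enumerate(values):
        if i < sample_size:
            reservoir.append(value)
        else:
            # Keep the new value with probability sample_size / (i + 1).
            j = rng.randint(0, i)
            if j < sample_size:
                reservoir[j] = value
    return exact_quantile(reservoir, q)


if __name__ == "__main__":
    data = list(range(100_000))
    random.Random(42).shuffle(data)
    print("exact median: ", exact_quantile(data, 0.5))
    print("approx median:", approx_quantile(data, 0.5))
```

The two functions answer the same question with different guarantees. That is
the argument for either a distinct name (td_median) or an explicit accuracy
setting, rather than silently returning an inexact value from a function
called median.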
