hive-issues mailing list archives

From "Zoltan Haindrich (JIRA)" <>
Subject [jira] [Commented] (HIVE-15978) Support regr_* functions
Date Fri, 10 Mar 2017 00:14:38 GMT


Zoltan Haindrich commented on HIVE-15978:

[~pxiong] I see multiple ways this could be achieved, and I'm not sure which one to take.

Most of these functions could (more or less) be translated into calls to existing UDAFs.
It needs some tweaking, but it can be done. I don't really want to reimplement all of that
logic again; I think it would be better to reuse it.

# If I create 'cover' UDAF evaluators for each of these functions and run the existing
evaluators inside the new ones, that could work, but there would be quite a few very similar
evaluators.
# The other alternative is to add slightly extended versions of some existing UDAFs (like count
and variance) and somehow rewrite the {{regr_sxx(y,x)}} invocations to {{extended_COUNT(x,
y) * extended_VAR_POP( y )}}.
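
To make the second alternative concrete, here is a minimal Python sketch of the rewrite, not Hive code. It assumes the usual SQL-standard reading in which {{regr_sxx(y, x)}} equals the number of pairs with both values non-null times the population variance of {{x}} over those same pairs. The helper names ({{non_null_pairs}}, {{regr_count}}, {{regr_sxx}}) are hypothetical and only stand in for the "extended" UDAFs mentioned above.

```python
from statistics import pvariance


def non_null_pairs(rows):
    """Keep only (y, x) pairs where neither value is None (SQL NULL)."""
    return [(y, x) for y, x in rows if y is not None and x is not None]


def regr_count(rows):
    """Stand-in for an 'extended' count: counts only fully non-null pairs."""
    return len(non_null_pairs(rows))


def regr_sxx(rows):
    """Assumed rewrite: pair count times population variance of x over the kept pairs."""
    xs = [x for _, x in non_null_pairs(rows)]
    if not xs:
        return None  # no qualifying pairs, mirror SQL's NULL result
    return len(xs) * pvariance(xs)


def sum_sq_dev_x(rows):
    """Direct definition for comparison: sum of squared deviations of x from its mean."""
    xs = [x for _, x in non_null_pairs(rows)]
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)


# Two of the five rows contain a NULL and must be ignored entirely.
rows = [(1.0, 2.0), (None, 100.0), (3.0, 4.0), (5.0, None), (7.0, 6.0)]
```

With these rows, {{regr_sxx(rows)}} and {{sum_sq_dev_x(rows)}} should agree up to floating-point rounding, which is the property the rewrite relies on.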

My guess is that the first alternative may give slightly better runtimes, though not significantly;
in the second case the "original" evaluators would do the real work.

As for why I need to change the existing UDAFs a bit: all of these regr_* functions are required
to do work only on rows where neither {{x}} nor {{y}} is null (e.g. {{regr_sxx(x,y)}}).
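
The null-pair requirement can be illustrated with a small streaming sketch in Python. It is only an illustration of the idea, not the Hive evaluator API: {{VariancePop}} is a stand-in for an existing single-column variance aggregator, and {{PairGuard}} is a hypothetical wrapper that forwards a row to it only when both inputs are non-null.

```python
class VariancePop:
    """Stand-in for an existing single-column population variance aggregator."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def iterate(self, v):
        if v is None:
            return  # an ordinary aggregate only skips its own NULL input
        self.n += 1
        delta = v - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (v - self.mean)  # Welford's online update

    def terminate(self):
        return self.m2 / self.n if self.n else None


class PairGuard:
    """Hypothetical wrapper: drop the whole row if either y or x is NULL."""

    def __init__(self, inner, pick):
        self.inner = inner
        self.pick = pick  # chooses which column the inner aggregator sees

    def iterate(self, y, x):
        if y is None or x is None:
            return
        self.inner.iterate(self.pick(y, x))

    def terminate(self):
        return self.inner.terminate()


rows = [(1.0, 2.0), (None, 100.0), (3.0, 4.0), (5.0, None), (7.0, 6.0)]

plain = VariancePop()
guarded = PairGuard(VariancePop(), pick=lambda y, x: x)
for y, x in rows:
    plain.iterate(x)       # sees x = 100.0 even though its y is NULL
    guarded.iterate(y, x)  # ignores that row completely
```

Here {{plain}} includes the {{x}} value from the row whose {{y}} is null, while {{guarded}} does not, so the two results differ. That difference is why an unmodified variance UDAF cannot simply be reused as-is.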

> Support regr_* functions
> ------------------------
>                 Key: HIVE-15978
>                 URL:
>             Project: Hive
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Carter Shanklin
>            Assignee: Zoltan Haindrich
> Support the standard regr_* functions: regr_slope, regr_intercept, regr_r2, regr_sxx,
regr_syy, regr_sxy, regr_avgx, regr_avgy, regr_count. SQL reference section 10.9.
