hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Phabricator (JIRA)" <>
Subject [jira] [Commented] (HIVE-4080) Add Lead & Lag UDAFs
Date Wed, 27 Feb 2013 17:01:16 GMT


Phabricator commented on HIVE-4080:

ashutoshc has requested changes to the revision "HIVE-4080 [jira] Add Lead & Lag UDAFs".

  Some comments.

  ql/src/java/org/apache/hadoop/hive/ql/exec/ Use LEAD_FUNC_NAME
and LAG_FUNC_NAME here to be consistent.
  ql/src/java/org/apache/hadoop/hive/ql/exec/ This null check should
be done outside of nesting if() block.
  ql/src/java/org/apache/hadoop/hive/ql/exec/ I don't see a case
where fInfo is != null but udafResolver could be. If so, than this null check is redundant
and should be removed.
  ql/src/java/org/apache/hadoop/hive/ql/exec/ We are already know
that we are dealing with lead/lag udaf, it must be of type GenericUDAFResolver2.. no ?
  ql/src/java/org/apache/hadoop/hive/ql/exec/ I am wondering if this
function can be rewritten as following:

  WindowFunctionInfo finfo = windowFunctions.get(name.toLowerCase());
  if (finfo == null) { return null;}
  f ( !name.toLowerCase().equals(LEAD_FUNC_NAME) &&
          !name.toLowerCase().equals(LAG_FUNC_NAME) ) {
  return getGenericUDAFEvaluator(name, argumentOIs, isDistinct, isAllColumns);

  // this must be lead/lag UDAF
  GenericUDAFResolver udafResolver = finfo.getfInfo().getGenericUDAFResolver();
  GenericUDAFParameterInfo paramInfo = new SimpleGenericUDAFParameterInfo(
   argumentOIs.toArray(), isDistinct, isAllColumns);
  return  ((GenericUDAFResolver2) udafResolver).getEvaluator(paramInfo);

  If not, than I have specific questions. See next few comments.
  ql/src/java/org/apache/hadoop/hive/ql/exec/ It will be good to
comments for this new boolean registerAsUDAF. Something like following
  There are certain UDAFs like lead/lag which we want as windowing functions, but don't want
them to appear in mFunctions.
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/ This and GenericUDAFLead
share lots of common code. It might be good to have an abstract class for these two, just
the way you have it in GenericUDFLeadLag.
  ql/src/test/queries/clientpositive/leadlag_queries.q:20 I think currently we don't support
over clause on expressions. Once we do, it will be good to add test like:
  select p_retailprice - lead (p_retail) over (partition by p_mfgr) from part;
  ql/src/test/queries/clientpositive/leadlag_queries.q:35 It will be good to add a test which
has both lead and lag in same query.




To: JIRA, ashutoshc, hbutani

> Add Lead & Lag UDAFs
> --------------------
>                 Key: HIVE-4080
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: PTF-Windowing
>            Reporter: Harish Butani
>            Assignee: Harish Butani
>         Attachments: HIVE-4080.1.patch.txt, HIVE-4080.D8961.1.patch
> Currently we support Lead/Lag as navigation UDFs usable with Windowing.
> To be standard compliant we need to support Lead & Lag UDAFs.
> Will continue to support Lead/Lag UDFs as arguments to UDAFs when Windowing is in play.

> Currently allow Lead/Lag expressions to appear in SelectLists even when they are not
arguments to UDAFs. Support for this feature will probably be removed. Causes ambiguities
when Query contains different partition clauses. Will provide more details with associated
Jira to remove this feature.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:

View raw message