hive-dev mailing list archives

From "Harish Butani (JIRA)" <>
Subject [jira] [Commented] (HIVE-896) Add LEAD/LAG/FIRST/LAST analytical windowing functions to Hive.
Date Fri, 01 Feb 2013 03:17:16 GMT


Harish Butani commented on HIVE-896:


So the plan looks like this:

... -> ReduceSink -> Extract -> PTFOp

We don't know which columns a PTF will access; the contract is that it has access to all columns
in its input. So we don't want any column pruning to happen, and we don't put a Select Op before
the ReduceSink. At translation time we see all the columns, including the virtual columns (VCs).
It appears that during optimization VCs are carried forward only if required, so at runtime the
ColumnExprNodeDescs refer to the wrong internalNames. Does this make sense? Is there
a way to carry forward the VCs when a PTF is present? The other option (which we have taken)
is to say VCs are not available to PTFs.
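
A toy model of the failure mode described above, written in Python rather than Hive's Java. It assumes, purely for illustration, that internal names are positional (`_col0`, `_col1`, ...) and that an optimizer pass drops a VC nobody asked for. The helper names, the sample schema, and the column order are hypothetical and are not taken from the Hive code base.

```python
# Toy model only: this is not Hive code. Helper names and schema are hypothetical.

def assign_internal_names(columns):
    """Map positional internal names (_col0, _col1, ...) to column names."""
    return {f"_col{i}": name for i, name in enumerate(columns)}

def prune(columns, required):
    """Keep only the columns listed in `required`, preserving input order."""
    return [c for c in columns if c in required]

# Schema as seen at translation time, with one virtual column in the middle
# (the position is an assumption made so that pruning shifts later columns).
translation_schema = ["user_id", "INPUT__FILE__NAME", "ts", "url"]
translation_names = assign_internal_names(translation_schema)

# The PTF expression is resolved against the translation-time schema.
ts_ref = next(k for k, v in translation_names.items() if v == "ts")

# During optimization the VC is dropped because nothing upstream requires it.
runtime_schema = prune(translation_schema, required={"user_id", "ts", "url"})
runtime_names = assign_internal_names(runtime_schema)

# The reference computed earlier now points at a different column.
resolved = runtime_names.get(ts_ref)
print(f"{ts_ref} meant 'ts' at translation time, resolves to {resolved!r} at runtime")
```

In this model the reference built for `ts` ends up pointing at `url` once the VC is pruned, which is the kind of mismatch the comment describes.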

Having said this, when the PTF is Windowing, we do know the columns being referred to; so we
should put a Select Op in front of the ReduceSink.
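
As a rough sketch of that last point, again in Python and with hypothetical names: if the windowing spec lists its partition, order, and function-argument columns, the projection for a Select Op ahead of the ReduceSink could be computed as the union of those columns. `WindowingSpec` and `select_columns` are illustrative only and do not correspond to Hive classes; a real query would also need whatever columns its select list references.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class WindowingSpec:
    """Hypothetical description of the columns a windowing clause touches."""
    partition_by: List[str]
    order_by: List[str]
    function_args: List[str]  # columns used as window function arguments

def select_columns(input_columns, spec):
    """Return the input columns the windowing spec needs, in input order."""
    needed = set(spec.partition_by) | set(spec.order_by) | set(spec.function_args)
    missing = needed - set(input_columns)
    if missing:
        raise ValueError(f"columns not in input: {sorted(missing)}")
    return [c for c in input_columns if c in needed]

spec = WindowingSpec(partition_by=["user_id"], order_by=["ts"], function_args=["url"])
projection = select_columns(
    ["user_id", "INPUT__FILE__NAME", "ts", "url", "referrer"], spec
)
print(projection)
```

Because the projection is known up front, pruning the unused VC and `referrer` column is safe here, unlike in the general PTF case where the accessed columns are unknown.
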
> Add LEAD/LAG/FIRST/LAST analytical windowing functions to Hive.
> ---------------------------------------------------------------
>                 Key: HIVE-896
>                 URL:
>             Project: Hive
>          Issue Type: New Feature
>          Components: OLAP, UDF
>            Reporter: Amr Awadallah
>            Priority: Minor
>         Attachments: DataStructs.pdf, HIVE-896.1.patch.txt, Hive-896.2.patch.txt
> Windowing functions are very useful for click stream processing and similar time-series/sliding-window
> More details at:
> -- amr
