hive-dev mailing list archives

From Ashutosh Chauhan <hashut...@apache.org>
Subject Re: Window function possible perf improvement
Date Thu, 07 May 2015 14:20:49 GMT
Harish has done some good work on the popular use cases of windowing in
https://issues.apache.org/jira/browse/HIVE-7062, which is available from
0.14 onwards. Would that be useful in your scenario? Or are you targeting
non-windowing PTFs?

Thanks,
Ashutosh

On Thu, May 7, 2015 at 6:43 AM, Sivaramakrishnan Narayanan <tarball@gmail.com> wrote:

> Hi,
>
> I was reading through the PTFOperator and related code and was wondering if
> there is an opportunity to optimize this function in
> WindowingTableFunction.java
>
>   public void execute(PTFPartitionIterator<Object> pItr, PTFPartition outP)
>       throws HiveException {
>
> This method iterates over the input partition once to compute the output
> columns. This causes a full read of the input partition.
>
> It then iterates over the input partition again to append the newly
> computed values. This causes another read of the input partition and a
> write to the output partition.
>
> I was wondering if it may be more efficient to append to the output
> partition as soon as window expressions have been computed. This will avoid
> one scan of the input partition.
>
> FYI - I've mostly been looking at the Hive 0.13 code, but a glance at trunk
> suggests the logic is the same there.
>
> Thanks,
>
> Siva
>
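
To make the two shapes discussed above concrete, here is a minimal sketch in plain Java. It does not use Hive's classes: `WindowPassSketch`, `twoPass`, and `singlePass` are hypothetical names, a `List<Integer>` stands in for the input partition, and a running sum stands in for the window expression. It assumes the window value for a row depends only on rows already seen; window frames that look ahead would need buffering, so this only illustrates the idea and is not the actual WindowingTableFunction logic.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration; not Hive's WindowingTableFunction.
class WindowPassSketch {

    // Two-pass shape: pass 1 reads the whole input to compute the window
    // column; pass 2 reads the input again to append the computed values.
    static List<List<Integer>> twoPass(List<Integer> input) {
        List<Integer> computed = new ArrayList<>();
        int sum = 0;
        for (int value : input) {          // first full read of the input
            sum += value;
            computed.add(sum);
        }
        List<List<Integer>> output = new ArrayList<>();
        for (int i = 0; i < input.size(); i++) {   // second full read
            output.add(List.of(input.get(i), computed.get(i)));
        }
        return output;
    }

    // Single-pass shape: append each output row as soon as its window value
    // is known. Valid here only because a running sum needs no later rows.
    static List<List<Integer>> singlePass(List<Integer> input) {
        List<List<Integer>> output = new ArrayList<>();
        int sum = 0;
        for (int value : input) {          // the only read of the input
            sum += value;
            output.add(List.of(value, sum));
        }
        return output;
    }

    public static void main(String[] args) {
        List<Integer> partition = List.of(3, 1, 4);
        System.out.println(twoPass(partition));
        System.out.println(singlePass(partition));
    }
}
```

Both methods produce the same rows; the difference is only in how many times the input is scanned.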
