hive-dev mailing list archives

From "Aihua Xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-9228) Problem with subquery using windowing functions
Date Mon, 09 Feb 2015 19:02:35 GMT

    [ https://issues.apache.org/jira/browse/HIVE-9228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312646#comment-14312646 ]

Aihua Xu commented on HIVE-9228:
--------------------------------

 [~navis] Thanks for the update. I'm new to Hive and was not very comfortable with my own
change. Thanks for stepping in. :)

> Problem with subquery using windowing functions
> -----------------------------------------------
>
>                 Key: HIVE-9228
>                 URL: https://issues.apache.org/jira/browse/HIVE-9228
>             Project: Hive
>          Issue Type: Bug
>          Components: PTF-Windowing
>    Affects Versions: 0.13.1
>            Reporter: Aihua Xu
>            Assignee: Aihua Xu
>         Attachments: HIVE-9228.1.patch.txt, HIVE-9228.2.patch.txt, HIVE-9228.3.patch.txt,
>                      create_table_tab1.sql, tab1.csv
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> The following query with window functions fails. The inner query works fine.
> select col1, col2, col3 from (select col1, col2, col3, count(case when col4=1 then 1 end)
> over (partition by col1, col2) as col5, row_number() over (partition by col1, col2 order
> by col4) as col6 from tab1) t;
> Hive generates an execution plan with 2 jobs.
> 1. The first job calculates the window function for col5.
> 2. The second job calculates the window function for col6 and produces the output.
> The plan says the first job outputs the columns (col1, col2, col3, col4) to a temp file,
> since only these columns are used in the later stage. However, the PTF operator for the
> first job outputs (_wcol0, col1, col2, col3, col4), with _wcol0 being the result of the
> window function, even though it's not used.
> In the second job, the map operator still reads the 4 columns (col1, col2, col3, col4)
> from the temp file, as the plan specifies. That causes the exception.
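
The mismatch described above can be sketched in a few lines of Python. This is only an illustration, not Hive code: the `read_row` helper, the sample row values, and the assumption that the reader maps values to column names strictly by position are all hypothetical. It shows why a row written with five values cannot be read back with the four-column schema from the plan.

```python
# Illustrative only: this is not Hive code. It mimics a reader that maps
# row values to column names by position, which is an assumption made here.

# Columns the plan says job 1 writes to the temp file.
PLANNED_COLUMNS = ["col1", "col2", "col3", "col4"]
# Columns the PTF operator actually emits (window result first).
ACTUAL_COLUMNS = ["_wcol0", "col1", "col2", "col3", "col4"]


def read_row(values, schema):
    """Map values onto schema names by position; reject width mismatches."""
    if len(values) != len(schema):
        raise ValueError(
            f"row has {len(values)} values but schema expects {len(schema)}"
        )
    return dict(zip(schema, values))


# A made-up row as job 1 would write it: window result, then the inputs.
written_row = [2, "a", "b", "c", 1]

try:
    read_row(written_row, PLANNED_COLUMNS)
    outcome = "ok"
except ValueError as err:
    outcome = f"failed: {err}"

print(outcome)
```

Reading the same row with `ACTUAL_COLUMNS` succeeds in this sketch, which mirrors the two ways the inconsistency could be resolved: either the writer drops the unused `_wcol0`, or the reader's schema includes it.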



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
